COVAIN

COVAIN: a toolbox for uni- and multivariate statistics, time-series and correlation network analysis, and inverse estimation of the differential Jacobian from metabolomics covariance data.

Metabolomics is emerging as one of the cornerstones of systems biology: it characterizes metabolic activity as the ultimate readout of physiological processes in biological systems and thereby links genotypes to their corresponding phenotypes. Because metabolomics data are high-dimensional, their statistical analysis is complex, and no single technique for statistical analysis and biological interpretation is sufficient to reveal their full information content. A combination of univariate and multivariate statistics, network topology analysis and biochemical pathway mapping is therefore recommended in all cases. To this end, we developed covariance inverse (COVAIN), a MATLAB toolbox with full graphical user interface support. COVAIN provides a complete workflow: uploading data, data preprocessing, uni- and multivariate statistical analysis, Granger time-series analysis, pathway mapping, correlation network topology analysis and visualization, and finally saving the results in a user-friendly way. It covers analysis of variance, principal component analysis, independent component analysis, clustering and correlation coefficient analysis, and it integrates new algorithms, such as Granger causality and permutation entropy analysis, that are not implemented in other, similar software. Furthermore, we provide a new algorithm to reconstruct a differential Jacobian matrix from two different metabolic conditions. The algorithm is based on the assumptions of stochastic fluctuations in the metabolic network, as we described recently.
By integrating the metabolomics covariance matrix with the stoichiometric matrix N of the corresponding pathways, this approach allows a systematic investigation of perturbation sites in the biochemical network based on metabolomics data. COVAIN was developed primarily for metabolomics data but can also be used to analyze other omics data. A module written in C was integrated to handle computationally intensive work on large datasets, e.g., genome-level proteomics and transcriptomics datasets, which usually contain several thousand or more variables. COVAIN can also perform cross-analysis and integration of several datasets, which may be useful for investigating responses at different hierarchical levels of the cellular context and for revealing the system's response as an integrated molecular network. The source code can be downloaded from http://www.univie.ac.at/mosys/software.html.
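
The link between a covariance matrix and a Jacobian under stochastic fluctuations can be illustrated with a minimal sketch. It is written in Python with NumPy rather than MATLAB and is not taken from the COVAIN source code. It assumes the relation is written as the Lyapunov equation J C + C J^T = -2 D, where C is the metabolite covariance matrix, J the Jacobian and D a fluctuation matrix. For simplicity D is treated as known here, which will generally not hold for real data. The boolean `mask` is a hypothetical stand-in for the structural information that would be derived from the stoichiometric matrix N, and the function names `covariance_from_jacobian` and `jacobian_from_covariance` are illustrative only.

```python
import numpy as np


def covariance_from_jacobian(J, D):
    """Forward problem: solve J C + C J^T = -2 D for the covariance C."""
    n = J.shape[0]
    I = np.eye(n)
    # With row-major flattening, vec(J C) = (J kron I) vec(C)
    # and vec(C J^T) = (I kron J) vec(C).
    A = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(A, (-2.0 * D).ravel()).reshape(n, n)


def jacobian_from_covariance(C, D, mask):
    """Inverse problem: least-squares estimate of the entries of J allowed
    by `mask`, given C and an assumed-known fluctuation matrix D."""
    n = C.shape[0]
    idx = np.argwhere(mask)  # positions of the structurally non-zero entries
    A = np.zeros((n * n, len(idx)))
    for k, (i, j) in enumerate(idx):
        E = np.zeros((n, n))
        E[i, j] = 1.0  # unit matrix for the unknown entry J[i, j]
        A[:, k] = (E @ C + C @ E.T).ravel()
    x, *_ = np.linalg.lstsq(A, (-2.0 * D).ravel(), rcond=None)
    J = np.zeros((n, n))
    J[idx[:, 0], idx[:, 1]] = x
    return J


# Toy linear chain of three metabolites (values are made up for illustration).
J_true = np.array([[-1.0, 0.0, 0.0],
                   [1.0, -2.0, 0.0],
                   [0.0, 2.0, -3.0]])
D = np.diag([1.0, 0.5, 0.2])  # assumed fluctuation matrix
mask = J_true != 0            # hypothetical structure, standing in for N

C = covariance_from_jacobian(J_true, D)
J_est = jacobian_from_covariance(C, D, mask)
print(np.round(J_est, 6))
```

In this toy setting the symmetric equation supplies six independent constraints for five unknown Jacobian entries, so the estimate matches the Jacobian used to generate C. A differential Jacobian could then be formed by repeating the estimate for the covariance matrices of two conditions and comparing the results entry by entry. How COVAIN defines that comparison and how it handles the unknown D are not specified here.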