Joint and unique multiblock analysis of biological data - multiomics malaria study.

Surowiec I, Skotare T, Sjögren R, Gouveia-Figueira S, Orikiiriza J, Bergström S, Normark J, Trygg J

Faraday Discuss. 218 (0) 268-283 [2019-08-15; online 2019-05-24]

Modern profiling technologies enable us to obtain large amounts of data that can later be used to gain a comprehensive understanding of the studied system. Proper evaluation of such data is challenging and cannot be achieved by analysing each data set separately. Integrated approaches are necessary, because only data integration allows us to find correlation trends common to all studied data sets and to reveal hidden structures not known a priori. This improves the understanding and interpretation of complex systems. Joint and Unique MultiBlock Analysis (JUMBA) is an analysis method based on the OnPLS algorithm that decomposes a set of matrices into joint parts, containing variation shared with other connected matrices, and unique parts, containing variation specific to each single matrix. Mapping unique variation is important from a data integration perspective, since not all variation can be expected to co-vary across data sets. In this work we used JUMBA for the integrated analysis of lipidomic, metabolomic and oxylipin data sets obtained from profiling of plasma samples from children infected with P. falciparum malaria. P. falciparum is one of the primary contributors to childhood mortality and obstetric complications in the developing world, which makes the development of new diagnostic and prognostic tools, as well as a better understanding of the disease, of utmost importance. In the presented work, JUMBA made it possible to detect already known trends related to disease progression, and also to discover new structures in the data connected to food intake and personal differences in metabolism. By separating the variation in each data set into joint and unique parts, JUMBA reduced the complexity of the analysis and facilitated the detection of samples and variables corresponding to specific structures across multiple data sets, thereby enabling fast interpretation of the studied system. These properties make JUMBA well suited to multiblock analysis of systems biology data.
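
The idea of splitting each data block into a joint part and a unique part can be illustrated with a minimal sketch. This is not the OnPLS algorithm and not the authors' implementation: it is a simplified two-block stand-in that assumes the shared directions can be taken from an SVD of the cross-product of the centered blocks. The function name `joint_unique_split`, the parameter `n_joint`, and the synthetic data are all hypothetical.

```python
import numpy as np


def joint_unique_split(x1, x2, n_joint):
    """Simplified two-block joint/unique split (illustrative only, NOT OnPLS).

    x1, x2 : arrays of shape (n_samples, n_vars_1) and (n_samples, n_vars_2)
    n_joint: assumed number of joint components (hypothetical parameter)
    """
    # Column-center each block so the cross-product reflects covariation.
    x1 = x1 - x1.mean(axis=0)
    x2 = x2 - x2.mean(axis=0)

    # Leading singular vectors of X1^T X2 give the variable directions
    # in each block that covary most strongly with the other block.
    u, _, vt = np.linalg.svd(x1.T @ x2, full_matrices=False)
    w1 = u[:, :n_joint]
    w2 = vt[:n_joint].T

    # Joint part: projection of each block onto its shared directions.
    joint1 = x1 @ w1 @ w1.T
    joint2 = x2 @ w2 @ w2.T

    # Unique part (here also containing residual noise): what is left.
    return (joint1, x1 - joint1), (joint2, x2 - joint2)


# Synthetic example: one shared latent trend plus block-specific variation.
rng = np.random.default_rng(0)
shared = rng.normal(size=(20, 1))
x1 = shared @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(20, 5))
x2 = shared @ rng.normal(size=(1, 8)) + 0.1 * rng.normal(size=(20, 8))

(j1, u1), (j2, u2) = joint_unique_split(x1, x2, n_joint=1)
```

In this sketch the joint and unique parts of a block sum back to the centered block by construction. The actual method handles more than two blocks and separates unique structure from residual noise, which this illustration does not attempt.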

