Strategy for minimizing between-study variation of large-scale phenotypic experiments using multivariate analysis.

Pinto RC, Gerber L, Eliasson M, Sundberg B, Trygg J

Anal. Chem. 84 (20) 8675-8681 [2012-10-16; online 2012-09-18]

We have developed a multistep strategy that integrates data from several large-scale experiments that suffer from systematic between-experiment variation. This strategy removes such variation that would otherwise mask differences of interest. It was applied to the evaluation of wood chemical analysis of 736 hybrid aspen trees: wild-type controls and transgenic trees potentially involved in wood formation. The trees were grown in four different greenhouse experiments, which introduced significant variation between experiments. Pyrolysis coupled to gas chromatography/mass spectrometry (Py-GC/MS) was used as a high-throughput screening platform for fingerprinting of wood chemotype. Our proposed strategy includes quality control, outlier detection, gene-specific classification, and consensus analysis. The orthogonal projections to latent structures discriminant analysis (OPLS-DA) method was used to generate the consensus chemotype profiles for each transgenic line. These were thereafter compiled to generate a global dataset. Multivariate analysis and cluster analysis techniques revealed a drastic reduction in between-experiment variation that enabled a global analysis of all transgenic lines from the four independent experiments. Information from in-depth analysis of specific transgenic lines and independent peak identification validated our proposed strategy.

Bioinformatics Support and Infrastructure

Bioinformatics Support, Infrastructure and Training

PubMed 22978754

DOI 10.1021/ac301869p

Crossref 10.1021/ac301869p