Strategy for minimizing between-study variation of large-scale phenotypic experiments using multivariate analysis.

Pinto RC, Gerber L, Eliasson M, Sundberg B, Trygg J

Anal. Chem. 84 (20) 8675-8681 [2012-10-16; online 2012-09-18]

We have developed a multistep strategy that integrates data from several large-scale experiments that suffer from systematic between-experiment variation. This strategy removes such variation, which would otherwise mask differences of interest. It was applied to the evaluation of wood chemical analysis of 736 hybrid aspen trees: wild-type controls and transgenic trees potentially involved in wood formation. The trees were grown in four different greenhouse experiments, which imposed significant variation between experiments. Pyrolysis coupled to gas chromatography/mass spectrometry (Py-GC/MS) was used as a high-throughput screening platform for fingerprinting of wood chemotype. Our proposed strategy includes quality control, outlier detection, gene-specific classification, and consensus analysis. The orthogonal projections to latent structures discriminant analysis (OPLS-DA) method was used to generate the consensus chemotype profiles for each transgenic line. These profiles were then compiled to generate a global dataset. Multivariate analysis and cluster analysis techniques revealed a drastic reduction in between-experiment variation that enabled a global analysis of all transgenic lines from the four independent experiments. Information from in-depth analysis of specific transgenic lines and independent peak identification validated our proposed strategy.

Bioinformatics Support and Infrastructure

Bioinformatics Support, Infrastructure and Training

PubMed 22978754

DOI 10.1021/ac301869p