Malmström L, Bakochi A, Svensson G, Kilsgård O, Lantz H, Petersson AC, Hauri S, Karlsson C, Malmström J
J Proteomics 129 (-) 98-107 [2015-11-03; online 2015-09-19]
The increasing number of bacterial genomes, in combination with reproducible quantitative proteome measurements, provides new opportunities to explore how genetic differences modulate proteome composition and virulence. Combining genome and proteome data is challenging because the underlying genome influences the proteome. We present a strategy to facilitate the integration of genome data from several genetically similar bacterial strains with data-independent analysis mass spectrometry (DIA-MS) for rapid interrogation of the combined data sets. The strategy relies on the construction of a composite genome that combines all genetic data in a compact format and can accommodate fusion with quantitative peptide and protein information determined via DIA-MS. We demonstrate the method by combining data sets from whole genome sequencing, shotgun MS and DIA-MS from 34 clinical isolates of Streptococcus pyogenes. The data structure allows for fast exploration of the data, showing that undetected proteins are on average more amenable to amino acid substitution than expressed proteins. We identified several proteins that were significantly differentially expressed between invasive and non-invasive strains. The work underlines how integration of whole genome sequencing with accurately quantified proteomes can further advance the interpretation of the relationship between genomes, proteomes and virulence. This article is part of a Special Issue entitled: Computational Proteomics.
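
The abstract does not describe how the composite genome is implemented. As a rough illustration of the general idea only, the Python sketch below stores each distinct protein sequence once per gene, together with the strains that carry it, and joins that with per-strain protein quantities. The class name, method names, gene names, sequences and intensities are all hypothetical toy values, not the authors' data structure or data.

```python
from dataclasses import dataclass, field


@dataclass
class CompositeGenome:
    """Toy composite genome (hypothetical design, not the published one)."""
    # gene -> {protein sequence variant -> set of strains carrying it}
    variants: dict = field(default_factory=dict)
    # (strain, gene) -> quantified protein intensity
    quantities: dict = field(default_factory=dict)

    def add_protein(self, strain, gene, sequence):
        # Identical sequences from different strains collapse into one entry.
        self.variants.setdefault(gene, {}).setdefault(sequence, set()).add(strain)

    def add_quantity(self, strain, gene, intensity):
        self.quantities[(strain, gene)] = intensity

    def variant_count(self, gene):
        return len(self.variants.get(gene, {}))

    def is_detected(self, gene):
        # A gene counts as detected if any strain has a quantity for it.
        return any(g == gene for (_, g) in self.quantities)

    def mean_variants(self, detected):
        # Average number of sequence variants over detected or undetected genes.
        counts = [len(seqs) for gene, seqs in self.variants.items()
                  if self.is_detected(gene) == detected]
        return sum(counts) / len(counts) if counts else 0.0


# Toy input: three strains, two made-up genes.
cg = CompositeGenome()
for strain in ("strainA", "strainB", "strainC"):
    cg.add_protein(strain, "geneX", "MKTLLV")       # conserved in all strains
cg.add_protein("strainA", "geneY", "MSTAAQ")        # three distinct variants
cg.add_protein("strainB", "geneY", "MSTAVQ")
cg.add_protein("strainC", "geneY", "MSTGAQ")
cg.add_quantity("strainA", "geneX", 1200.0)         # only geneX is quantified
cg.add_quantity("strainB", "geneX", 950.0)

print(cg.mean_variants(detected=True), cg.mean_variants(detected=False))
```

In this toy input the quantified gene has a single sequence variant and the unquantified gene has three, so the two averages differ. This mirrors the kind of query the abstract mentions (comparing sequence variability of detected and undetected proteins) and says nothing about the actual result or its magnitude.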