Commun Biol 3 (1) 119 [2020-03-13; online 2020-03-13]
The genome encodes the metabolic and functional capabilities of an organism and should be a major determinant of its ecological niche. Yet, it is unknown if the niche can be predicted directly from the genome. Here, we conduct metagenomic binning on 123 water samples spanning major environmental gradients of the Baltic Sea. The resulting 1961 metagenome-assembled genomes represent 352 species-level clusters that correspond to 1/3 of the metagenome sequences of the prokaryotic size-fraction. By using machine-learning, the placement of a genome cluster along various niche gradients (salinity level, depth, size-fraction) could be predicted based solely on its functional genes. The same approach predicted the genomes' placement in a virtual niche-space that captures the highest variation in distribution patterns. The predictions generally outperformed those inferred from phylogenetic information. Our study demonstrates a strong link between genome and ecological niche and provides a conceptual framework for predictive ecology based on genomic data.