Department of Gene Technology, Science for Life Laboratory, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Stockholm, Sweden.
Leibniz Institute for Baltic Sea Research, Warnemünde, Germany.
Commun Biol. 2020 Mar 13;3(1):119. doi: 10.1038/s42003-020-0856-x.
The genome encodes the metabolic and functional capabilities of an organism and should be a major determinant of its ecological niche. Yet it is unknown whether the niche can be predicted directly from the genome. Here, we conduct metagenomic binning on 123 water samples spanning major environmental gradients of the Baltic Sea. The resulting 1961 metagenome-assembled genomes represent 352 species-level clusters that correspond to one third of the metagenome sequences of the prokaryotic size fraction. Using machine learning, we could predict the placement of a genome cluster along various niche gradients (salinity level, depth, size fraction) based solely on its functional genes. The same approach predicted the genomes' placement in a virtual niche space that captures the highest variation in distribution patterns. The predictions generally outperformed those inferred from phylogenetic information. Our study demonstrates a strong link between genome and ecological niche and provides a conceptual framework for predictive ecology based on genomic data.
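To make the idea of predicting a niche gradient from functional genes concrete, the following is a minimal sketch rather than the study's method. The abstract does not name the learning algorithm, so nearest-neighbour regression on gene presence/absence is used here purely as a stand-in. The data are synthetic: the function names, the salinity range, the gene "optima" and the presence probabilities are all illustrative assumptions, not values from the paper.

```python
import math
import random


def jaccard_distance(a, b):
    """Jaccard distance between two sets of gene-family identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def predict_niche(query_genes, training, k=3):
    """Predict a niche value as the mean over the k most similar genomes."""
    nearest = sorted(
        training, key=lambda item: jaccard_distance(query_genes, item[0])
    )[:k]
    return sum(value for _, value in nearest) / len(nearest)


def leave_one_out(dataset, k=3):
    """Predict each genome's niche value from all other genomes."""
    predictions = []
    for i, (genes, _) in enumerate(dataset):
        others = dataset[:i] + dataset[i + 1:]
        predictions.append(predict_niche(genes, others, k))
    return predictions


def pearson(xs, ys):
    """Pearson correlation between observed and predicted values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_synthetic_dataset(n_genomes=60, n_genes=40, max_salinity=35.0, seed=0):
    """Build hypothetical genomes whose gene content depends on salinity.

    Assumption for illustration only: each gene has a salinity optimum and is
    usually present in genomes living within 7 units of it, rarely otherwise.
    """
    rng = random.Random(seed)
    optima = [max_salinity * g / (n_genes - 1) for g in range(n_genes)]
    dataset = []
    for _ in range(n_genomes):
        salinity = rng.uniform(0.0, max_salinity)
        genes = set()
        for g, optimum in enumerate(optima):
            p = 0.9 if abs(salinity - optimum) <= 7.0 else 0.05
            if rng.random() < p:
                genes.add(g)
        dataset.append((genes, salinity))
    return dataset


if __name__ == "__main__":
    data = make_synthetic_dataset()
    predicted = leave_one_out(data, k=3)
    observed = [salinity for _, salinity in data]
    print("correlation between observed and predicted salinity:",
          round(pearson(observed, predicted), 3))
```

Leave-one-out evaluation is used so that every genome is predicted only from the others, which loosely mirrors asking whether an unseen genome's niche can be inferred from its gene content. A comparable baseline built on phylogenetic distance instead of gene content would be needed to reproduce the comparison described in the abstract.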