Iablokov Stanislav N, Klimenko Natalia S, Efimova Daria A, Shashkova Tatiana, Novichkov Pavel S, Rodionov Dmitry A, Tyakht Alexander V
A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.
P.G. Demidov Yaroslavl State University, Yaroslavl, Russia.
Front Mol Biosci. 2021 Jan 18;7:603740. doi: 10.3389/fmolb.2020.603740. eCollection 2020.
The gut microbiome is of utmost importance to human health. While a healthy microbiome can be represented by a variety of structures, its functional capacity appears to be more important. Gene content of the community can be assessed by "shotgun" metagenomics, but this approach is still too expensive. High-throughput amplicon-based surveys are a method of choice for large-scale studies of links between the microbiome, diseases, and diet, but the algorithms for predicting functional composition need to be improved to achieve good precision. Here we show how feature engineering based on microbial phenotypes, an advanced method for functional prediction from 16S rRNA sequencing data, improves identification of alterations of the gut microbiome linked to disease. We processed a large collection of published gut microbial datasets of inflammatory bowel disease (IBD) patients to derive their community phenotype indices (CPI): high-precision semiquantitative profiles that aggregate the metabolic potential of the community members based on genome-wide metabolic reconstructions. The list of selected metabolic functions included metabolism of short-chain fatty acids, vitamins, and carbohydrates. The machine-learning approach based on microbial phenotypes allows us to distinguish the microbiome profiles of healthy controls from those of patients with Crohn's disease and those of patients with ulcerative colitis. The classifiers were comparable in quality to conventional taxonomy-based classifiers but provided new findings that give insights into possible mechanisms of pathogenesis. Feature-wise partial dependence plot (PDP) analysis of each feature's contribution to the classification result revealed a diversity of patterns. These observations suggest a constructive basis for defining functional homeostasis of the healthy human gut microbiome. The developed features are promising interpretable candidate biomarkers for assessing microbiome contribution to disease risk for the purposes of personalized medicine and clinical trials.
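As a rough illustration of how a community phenotype index could be computed, the sketch below assumes that a CPI is the relative-abundance-weighted average of binary phenotype calls across the taxa in a sample. The abstract does not give the formula, so this definition, the function name `community_phenotype_index`, and the toy numbers are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def community_phenotype_index(abundance, phenotypes):
    """Abundance-weighted phenotype profile per sample (assumed CPI definition).

    abundance:  (n_samples, n_taxa) counts or relative abundances from 16S data.
    phenotypes: (n_taxa, n_functions) binary matrix; 1 means the taxon is
                predicted to carry the metabolic function.
    Returns an (n_samples, n_functions) matrix with values in [0, 1].
    """
    abundance = np.asarray(abundance, dtype=float)
    phenotypes = np.asarray(phenotypes, dtype=float)
    totals = abundance.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every sample needs a positive total abundance")
    relative = abundance / totals  # normalize each sample to sum to 1
    return relative @ phenotypes   # weighted fraction of carriers per function


# Hypothetical toy data: 2 samples, 3 taxa, 2 functions
# (for example, butyrate production and a vitamin biosynthesis pathway).
abundance = np.array([[50, 30, 20],
                      [10, 0, 90]])
phenotypes = np.array([[1, 0],
                       [1, 1],
                       [0, 1]])
cpi = community_phenotype_index(abundance, phenotypes)
print(cpi)
```

Under this assumed definition, each CPI value reads as the fraction of the community predicted to carry a given function, which is what would make such features directly interpretable as inputs to a classifier.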