From the Bioinformatics and Artificial Intelligence Laboratory, Center for Hypertension and Precision Medicine, Program in Physiological Genomics, Department of Physiology and Pharmacology, University of Toledo College of Medicine and Life Sciences, Toledo, OH.
Hypertension. 2020 Nov;76(5):1555-1562. doi: 10.1161/HYPERTENSIONAHA.120.15885. Epub 2020 Sep 10.
Cardiovascular disease (CVD) is the leading cause of human mortality. In recent years, gut microbiota has emerged alongside genetic and environmental factors as a new factor influencing CVD. Although cause-effect relationships are not clearly established, prominent associations between alterations in gut microbiota and CVD have been reported. We therefore hypothesized that machine learning (ML) could be used for gut microbiome-based diagnostic screening of CVD. To test this hypothesis, fecal 16S ribosomal RNA sequencing data from 478 CVD and 473 non-CVD human subjects, collected through the American Gut Project, were analyzed using 5 supervised ML algorithms: random forest, support vector machine, decision tree, elastic net, and neural networks. Thirty-nine differential bacterial taxa were identified between the CVD and non-CVD groups. ML models trained on these taxonomic features achieved a testing area under the receiver operating characteristic curve (AUC; 0.0, perfect antidiscrimination; 0.5, random guessing; 1.0, perfect discrimination) of ≈0.58 (random forest and neural networks). Next, the ML models were trained on the top 500 high-variance operational taxonomic unit (OTU) features instead of bacterial taxa, which improved the testing AUC to ≈0.65 (random forest). Restricting the selection to only the top 25 highly contributing OTU features further significantly enhanced the AUC to ≈0.70. Overall, our study is the first to identify dysbiosis of gut microbiota in CVD patients as a group and to apply this knowledge to develop a gut microbiome-based ML approach for diagnostic screening of CVD.
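As a reading aid for the two quantitative ideas in the abstract, the following is a minimal Python sketch of how an area under the receiver operating characteristic curve maps onto the 0.0 / 0.5 / 1.0 scale and of what ranking features by variance means. It is not the authors' pipeline: the function names `roc_auc` and `top_variance_features` and all toy values are hypothetical, and the study's actual preprocessing and model settings are not given in the abstract.

```python
from statistics import pvariance


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive sample scores higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_variance_features(rows, k):
    """Return the indices of the k columns with the highest population variance."""
    n_cols = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_cols)]
    return sorted(range(n_cols), key=lambda j: variances[j], reverse=True)[:k]


# Toy labels: 1 = CVD, 0 = non-CVD (made-up values, not study data).
labels = [1, 1, 0, 0]
print(roc_auc(labels, [0.9, 0.8, 0.2, 0.1]))  # perfect discrimination
print(roc_auc(labels, [0.1, 0.2, 0.8, 0.9]))  # perfect antidiscrimination
print(roc_auc(labels, [0.5, 0.5, 0.5, 0.5]))  # no better than random guessing

# Toy abundance table: rows are subjects, columns are features.
toy = [[0, 5, 1], [0, 9, 2], [0, 1, 3]]
print(top_variance_features(toy, 2))
```

On this scale, the reported testing values of ≈0.58 to ≈0.70 sit between random guessing and perfect discrimination.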