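Diagnosis of Inflammatory Bowel Disease and Colorectal Cancer through Multi-View Stacked Generalization Applied on Gut Microbiome Data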

Most microbiome studies suggest that ensemble models such as Random Forest yield the best predictive power. In this study, we empirically evaluate a more powerful ensemble learning algorithm, multi-view stacked generalization, on cohorts of pediatric inflammatory bowel disease patients and adult colorectal cancer patients. We aim to check whether stacking leads to better results than using the single best machine learning algorithm. On the inflammatory bowel disease dataset, stacking achieves the best test-set Average Precision (AP), reaching AP = 0.69 and outperforming both the best base classifier (AP = 0.61) and the baseline meta learner built on top of the base classifiers (AP = 0.63). On the colorectal cancer dataset, the stacked classifier (AP = 0.81) also outperforms both the best base classifier (AP = 0.79) and the baseline meta learner (AP = 0.75). Stacking thus achieves the best predictive performance on the test set in both patient cohorts. Applying stacking solves the problem of choosing the most appropriate machine learning algorithm by automating model selection. Clinical application of such a model is not limited to diagnosis; thanks to its feature selection procedure, it can also be extended to biomarker selection.
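As a rough illustration of the general idea behind multi-view stacking, the sketch below trains base classifiers on separate feature blocks ("views"), collects their out-of-fold probabilities, and fits a meta learner on those probabilities before scoring with Average Precision. It is a minimal sketch under stated assumptions, not the study's pipeline: the data are synthetic, and the two views, the choice of Random Forest and logistic regression as base learners, and the logistic regression meta learner are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic stand-in for microbiome features (assumption: NOT the study's data).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical "views": two disjoint blocks of feature columns.
views = {"view_a": slice(0, 20), "view_b": slice(20, 40)}


def make_base_learners():
    """Return fresh, unfitted base classifiers (an illustrative choice)."""
    return {
        "rf": RandomForestClassifier(n_estimators=100, random_state=0),
        "lr": LogisticRegression(max_iter=1000),
    }


meta_train_cols, meta_test_cols, base_ap = [], [], {}
for view_name, cols in views.items():
    for model_name, model in make_base_learners().items():
        # Out-of-fold probabilities keep training labels from leaking
        # into the meta learner's inputs.
        oof = cross_val_predict(
            model, X_train[:, cols], y_train, cv=5, method="predict_proba"
        )[:, 1]
        # Refit on the full training split to score the held-out test split.
        model.fit(X_train[:, cols], y_train)
        test_proba = model.predict_proba(X_test[:, cols])[:, 1]
        meta_train_cols.append(oof)
        meta_test_cols.append(test_proba)
        base_ap[(view_name, model_name)] = average_precision_score(
            y_test, test_proba
        )

meta_train = np.column_stack(meta_train_cols)
meta_test = np.column_stack(meta_test_cols)

# Meta learner: combines the per-view, per-model probabilities.
meta_learner = LogisticRegression(max_iter=1000)
meta_learner.fit(meta_train, y_train)
stacked_ap = average_precision_score(
    y_test, meta_learner.predict_proba(meta_test)[:, 1]
)

print("base AP:", {k: round(v, 3) for k, v in base_ap.items()})
print("stacked AP:", round(stacked_ap, 3))
```

On synthetic data the stacked score may or may not exceed the best base score; the point is only to show how per-view predictions feed a second-level model and how the AP comparison in the abstract could be set up.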

