Imangaliyev Sultan, Schlötterer Jörg, Meyer Folker, Seifert Christin
Institute for Artificial Intelligence in Medicine, University of Duisburg-Essen, 45131 Essen, Germany.
Cancer Research Center Cologne Essen (CCCE), 45147 Essen, Germany.
Diagnostics (Basel). 2022 Oct 17;12(10):2514. doi: 10.3390/diagnostics12102514.
Most microbiome studies suggest that ensemble models such as Random Forest yield the best predictive power. In this study, we empirically evaluate a more powerful ensemble learning algorithm, multi-view stacked generalization, on cohorts of pediatric inflammatory bowel disease patients and adult colorectal cancer patients. We aim to check whether stacking leads to better results than using the single best machine learning algorithm. On the inflammatory bowel disease dataset, stacking achieves the best test-set Average Precision (AP), reaching AP = 0.69 and outperforming both the best base classifier (AP = 0.61) and the baseline meta learner built on top of the base classifiers (AP = 0.63). On the colorectal cancer dataset, the stacked classifier (AP = 0.81) also outperforms both the best base classifier (AP = 0.79) and the baseline meta learner (AP = 0.75). Stacking thus achieves the best predictive performance on the test set for both patient cohorts. Applying stacking solves the problem of choosing the most appropriate machine learning algorithm by automating the model selection procedure. Clinical application of such a model is not limited to the diagnosis task; thanks to the feature selection procedure, it can also be extended to biomarker selection.
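As a rough illustration of the idea, the following is a minimal sketch of multi-view stacking scored by Average Precision. It is not the authors' pipeline: it uses scikit-learn's `StackingClassifier` on synthetic data, and the two "views" (column blocks named `view_a` and `view_b`), the Random Forest base learners, and the logistic-regression meta learner are all assumptions made for the example.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for a microbiome feature table (assumption, not study data).
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Hypothetical views: each base learner only sees one block of columns.
views = {"view_a": list(range(0, 10)), "view_b": list(range(10, 20))}


def view_model(columns):
    """Random Forest restricted to the given columns (other columns are dropped)."""
    select = ColumnTransformer([("select", "passthrough", columns)])
    return make_pipeline(select,
                         RandomForestClassifier(n_estimators=100, random_state=0))


base_learners = [(name, view_model(cols)) for name, cols in views.items()]

# The meta learner is trained on out-of-fold probabilities of the base learners.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5, stack_method="predict_proba")
stack.fit(X_train, y_train)

# Test-set Average Precision for the stack and for each base learner alone.
stack_ap = average_precision_score(y_test, stack.predict_proba(X_test)[:, 1])
base_ap = {name: average_precision_score(y_test, est.predict_proba(X_test)[:, 1])
           for name, est in stack.named_estimators_.items()}

print(f"stacked AP: {stack_ap:.2f}")
for name, ap in base_ap.items():
    print(f"{name} AP: {ap:.2f}")
```

Comparing `stack_ap` with the entries of `base_ap` mirrors the comparison reported above between the stacked classifier and the best base classifier. On synthetic data the ordering may differ from the study's results.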