Machine learning-based colorectal cancer prediction using global dietary data.

Affiliations

University of Michigan, Ann Arbor, USA.

PAPRSB Institute of Health Sciences, Universiti Brunei Darussalam, Bandar Seri Begawan, Brunei.

Publication information

BMC Cancer. 2023 Feb 10;23(1):144. doi: 10.1186/s12885-023-10587-x.

Abstract

BACKGROUND

Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. Active health screening for CRC has detected the disease in increasingly younger adults. However, current machine learning algorithms, which are trained on older adults and smaller datasets, may not perform well in practice for large populations.

AIM

To evaluate machine learning algorithms using large datasets accounting for both younger and older adults from multiple regions and diverse sociodemographics.

METHODS

A large dataset of 109,343 participants in a dietary-based colorectal cancer case study from Canada, India, Italy, South Korea, Mexico, Sweden, and the United States was collected by the Centers for Disease Control and Prevention. This global dietary database was augmented with other publicly accessible information from multiple sources. Nine supervised and unsupervised machine learning algorithms were evaluated on the aggregated dataset.

RESULTS

Both supervised and unsupervised models performed well in predicting CRC and non-CRC phenotypes. A prediction model based on an artificial neural network (ANN) performed best, with a CRC misclassification rate of 1% and a non-CRC misclassification rate of 3%.
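The abstract does not say how the two misclassification figures were computed. A common reading is that each is the fraction of samples in a true class that the model assigned to the other class. The sketch below illustrates that reading only: the function name `per_class_misclassification` is hypothetical, and the labels are made-up toy values chosen to reproduce the arithmetic, not data from the study.

```python
from typing import Dict, Sequence


def per_class_misclassification(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[int, float]:
    """For each true class, return the fraction of its samples predicted as another class."""
    totals: Dict[int, int] = {}
    errors: Dict[int, int] = {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if pred != truth:
            errors[truth] = errors.get(truth, 0) + 1
    return {label: errors.get(label, 0) / count for label, count in totals.items()}


# Toy labels (illustrative only, not study data): 1 = CRC, 0 = non-CRC.
y_true = [1] * 100 + [0] * 100
# 1 of 100 CRC samples is missed; 3 of 100 non-CRC samples are flagged as CRC.
y_pred = [1] * 99 + [0] * 1 + [0] * 97 + [1] * 3

rates = per_class_misclassification(y_true, y_pred)
print(f"CRC misclassification: {rates[1]:.0%}")
print(f"non-CRC misclassification: {rates[0]:.0%}")
```

Reporting the error separately for each class keeps a rare positive class from being hidden behind a high overall accuracy, which matters when CRC cases are much less common than non-CRC cases.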

CONCLUSIONS

ANN models trained on large heterogeneous datasets may be applicable to both younger and older adults. Such models provide a solid foundation for building effective clinical decision support systems that assist healthcare providers in dietary-related, non-invasive screening and that can be applied in large studies. Combining optimal algorithms with high adherence to cancer screening is expected to significantly improve early diagnosis and boost the success rate of timely and appropriate cancer interventions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/046d/9921106/8cd533d5689b/12885_2023_10587_Fig1_HTML.jpg
