
Explainable Machine Learning Models for Colorectal Cancer Prediction Using Clinical Laboratory Data.

Authors

Li Rui, Hao Xiaoyan, Diao Yanjun, Yang Liu, Liu Jiayun

Affiliation

Department of Clinical Laboratory Medicine, Xijing Hospital, Air Force Medical University, Xi'an, China.

Publication

Cancer Control. 2025 Jan-Dec;32:10732748251336417. doi: 10.1177/10732748251336417. Epub 2025 May 7.

DOI: 10.1177/10732748251336417
PMID: 40334702
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12062600/
Abstract

Introduction

Early diagnosis of colorectal cancer (CRC) poses a significant clinical challenge. This study aims to develop machine learning (ML) models for CRC risk prediction using clinical laboratory data.

Methods

This retrospective, single-center study analyzed laboratory examination data from healthy controls (HC), polyp patients (Polyp), and CRC patients between 2013 and 2023. Five ML algorithms, including adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), decision tree (DT), logistic regression (LR), and random forest (RF), were employed to classify subjects into HC vs Polyp vs CRC, HC vs CRC, and Polyp vs CRC, respectively.

Results

This study included 31 539 subjects: 11 793 HCs, 10 125 polyp patients, and 9621 CRC patients. The XGBoost model achieved the highest AUCs of 0.966 for differentiating HC from CRC and 0.881 for Polyp from CRC, outperforming carcino-embryonic antigen (CEA) and fecal occult blood testing (FOBT) tests. This model could also identify CEA-negative or FOBT-negative CRC patients. Incorporating stool miR-92a detection into the model further improved diagnostic performance. Shapley additive explanations (SHAP) plots indicated that FOBT, CEA, lymphocyte percentage (LYMPH%), and hematocrit (HCT) were the most significant features contributing to CRC diagnosis. Additionally, a computational tool for predicting CRC risk based on the optimal model was developed, designed for researchers with programming experience.

Conclusion

Five ML models for CRC diagnosis, based on ten routine laboratory test items, were developed, achieving higher diagnostic accuracies than traditional CRC biomarkers. The diagnostic capabilities of these ML models can be further enhanced by including stool miR-92a levels.
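The abstract compares the model with single biomarkers by AUC and notes that the model can flag CRC patients whom a marker misses. The sketch below illustrates both ideas on a tiny made-up data set. It is not the authors' tool or data: the labels, scores, marker results, and the 0.5 decision threshold are all hypothetical, and the helper name `roc_auc` is an assumption introduced here. AUC is computed as the probability that a randomly chosen CRC case scores higher than a randomly chosen control, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: 0 = healthy control, 1 = CRC.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
# Hypothetical predicted probabilities from some trained classifier.
model_scores = [0.05, 0.20, 0.35, 0.10, 0.90, 0.60, 0.30, 0.80]
# Hypothetical binary marker result (1 = positive), standing in for a test like FOBT.
marker_positive = [0, 1, 0, 0, 1, 0, 0, 1]

model_auc = roc_auc(labels, model_scores)
marker_auc = roc_auc(labels, marker_positive)

# CRC cases the marker missed, and how many of them the model flags
# at an assumed decision threshold of 0.5.
THRESHOLD = 0.5
missed_scores = [
    s for y, s, m in zip(labels, model_scores, marker_positive) if y == 1 and m == 0
]
rescued = sum(1 for s in missed_scores if s >= THRESHOLD)

print(f"model AUC  = {model_auc}")
print(f"marker AUC = {marker_auc}")
print(f"marker-negative CRC cases flagged by model: {rescued} of {len(missed_scores)}")
```

The pairwise loop is quadratic in cohort size, which is fine for illustration; for a cohort of the size reported in the abstract, a rank-based implementation from an established library would be the more practical choice.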


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dd2/12062600/70215c98f5db/10.1177_10732748251336417-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dd2/12062600/4fb1a07b663e/10.1177_10732748251336417-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dd2/12062600/dc24d9bb1c64/10.1177_10732748251336417-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dd2/12062600/2d30febcdcce/10.1177_10732748251336417-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dd2/12062600/12c13e1b95f6/10.1177_10732748251336417-fig5.jpg


