Development and validation of a biomarker-based prediction model for metastasis in patients with colorectal cancer: Application of machine learning algorithms.

Authors

Ayubi Erfan, Farashi Sajjad, Tapak Leili, Afshar Saeid

Affiliations

Cancer Research Center, Institute of Cancer, Avicenna Health Research Institute, Hamadan University of Medical Sciences, Hamadan, Iran.

Neurophysiology Research Center, Institute of Neuroscience and Mental Health, Avicenna Health Research Institute, Hamadan University of Medical Sciences, Hamadan, Iran.

Publication

Heliyon. 2024 Dec 24;11(1):e41443. doi: 10.1016/j.heliyon.2024.e41443. eCollection 2025 Jan 15.

DOI:10.1016/j.heliyon.2024.e41443
PMID:39839508
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11748706/
Abstract

OBJECTIVE

The purpose of the current study was to develop and validate a biomarker-based prediction model for metastasis in patients with colorectal cancer (CRC).

METHODS

Two datasets, GSE68468 and GSE41568, were retrieved from the Gene Expression Omnibus (GEO) database. In the GSE68468 dataset, key biomarkers were identified through a screening process involving differential expression analysis, redundancy analysis, and recursive feature elimination technique. Subsequently, the prediction model was developed and internally validated using five machine learning (ML) algorithms including lasso and elastic-net regularized generalized linear model (glmnet), k-nearest neighbors (kNN), support vector machine (SVM) with Radial Basis Function Kernel, random forest (RF), and eXtreme Gradient Boosting (XGBoost). The predictive performance of the algorithm with the highest accuracy was then externally validated on the GSE41568 dataset.
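The screening-then-modelling workflow described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes scikit-learn is available, uses synthetic data from `make_classification` in place of the GEO expression matrices, covers only the recursive feature elimination and random forest steps, and treats a held-back split as a stand-in for an external cohort. The function name `run_sketch` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split


def run_sketch(seed=0):
    # Synthetic stand-in for an expression matrix: 300 samples x 200 "genes".
    X, y = make_classification(
        n_samples=300, n_features=200, n_informative=20, random_state=seed
    )
    # Hold back 100 samples as a stand-in for an external cohort
    # (a real external validation would use an independent dataset).
    X_dev, X_ext, y_dev, y_ext = train_test_split(
        X, y, test_size=100, stratify=y, random_state=seed
    )
    # Split the development data into training and internal validation sets.
    X_train, X_int, y_train, y_int = train_test_split(
        X_dev, y_dev, test_size=0.3, stratify=y_dev, random_state=seed
    )

    # Recursive feature elimination down to 16 features,
    # dropping up to 20 features per iteration.
    selector = RFE(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        n_features_to_select=16,
        step=20,
    )
    selector.fit(X_train, y_train)

    # Fit the final random forest on the selected features only.
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(selector.transform(X_train), y_train)

    # Report accuracy and Cohen's kappa on both evaluation sets.
    results = {}
    eval_sets = {"internal": (X_int, y_int), "external": (X_ext, y_ext)}
    for name, (X_eval, y_eval) in eval_sets.items():
        pred = model.predict(selector.transform(X_eval))
        results[name] = (
            accuracy_score(y_eval, pred),
            cohen_kappa_score(y_eval, pred),
        )
    return selector.support_, results
```

Fitting the selector on the training split only, and reusing it unchanged on the evaluation sets, keeps feature selection from leaking information out of the validation data.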

RESULTS

Among 22,283 registered genes in the GSE68468 dataset, the screening process identified 16 key genes [gene names missing in source], and these genes were used to build the prediction model. On the internal validation dataset, the prediction performance of the five ML algorithms was as follows: RF (accuracy = 0.97, kappa = 0.91), XGBoost (0.93, 0.81), kNN (0.93, 0.81), glmnet (0.93, 0.82), and SVM (0.92, 0.80). The top five biomarkers were [gene names missing in source]. On the external validation dataset, the RF model achieved an accuracy of 0.97, a kappa value of 0.92, and an area under the curve (AUC) of 0.99.
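For readers unfamiliar with the three metrics reported here, the following standard-library sketch shows how accuracy, Cohen's kappa, and AUC are computed for a binary classifier. The labels and scores are made-up toy values, not data from the study, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    # Agreement corrected for chance: (p_o - p_e) / (1 - p_e).
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Expected agreement if predictions were independent of the truth.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one class occurs
    return (p_o - p_e) / (1 - p_e)


def roc_auc(y_true, scores):
    # Probability that a random positive outscores a random negative
    # (ties count as one half); equivalent to the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 4 metastatic (1) and 6 non-metastatic (0) samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.45, 0.6]
y_pred = [int(s >= 0.5) for s in scores]  # classify at a 0.5 threshold

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Kappa is lower than accuracy whenever some agreement is expected by chance, which is why the abstract reports both, particularly for imbalanced classes.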

CONCLUSION

Using ML algorithms, this study identified biomarkers that may help to identify patients with CRC who are prone to metastasis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf9a/11748706/73ca7fbaa52f/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf9a/11748706/c805c2f3bbeb/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf9a/11748706/0d7f928b98da/gr2.jpg
