Risk factors and prediction of distant metastasis (DM) of colon adenocarcinoma: a logistic regression and machine learning study based on surveillance, epidemiology, and end results (SEER) database.

Author information

Guo Qiang, Li Junyun, Wei Zhe, Xu Jingjing, Duan Shaojun, Li Jianfeng, Liu Yaxi

Affiliations

Team of Clinical Pharmacy, Department of Pharmacy, Jincheng People's Hospital, Jincheng City, People's Republic of China.

Department of General Surgery, Jincheng People's Hospital, Jincheng City, People's Republic of China.

Publication information

BMC Cancer. 2025 Jul 1;25(1):1047. doi: 10.1186/s12885-025-14329-z.


DOI: 10.1186/s12885-025-14329-z
PMID: 40597951
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12211135/
Abstract

BACKGROUND: Because traditional imaging examinations have limitations in detecting distant metastasis (DM), such as low sensitivity, this study aimed to identify pathological and laboratory risk factors and to establish models that predict DM in patients with colon adenocarcinoma (CA).

METHODS: CA patients diagnosed between 2018 and 2021 were retrieved from the SEER database. Logistic regression was used to identify independent risk factors (IRFs) for DM. Twelve models were established and evaluated on a training set and a test set (7:3 split) of the retrieved patients: BNB (Bernoulli naïve Bayes), DT (decision tree), GBC (gradient boosting classifier), GNB (Gaussian naïve Bayes), KNN (K-nearest neighbor), LDA (linear discriminant analysis), LR (logistic regression), MLP (multi-layer perceptron classifier), MNB (multinomial naïve Bayes), QDA (quadratic discriminant analysis), RFC (random forest classifier), and SVC (support vector machine). Additionally, CA patient data were collected from Jincheng People’s Hospital (JCPH) as an external validation set for assessing the predictive efficacy of the models.

RESULTS: In total, 7,000 and 83 CA patients were retrieved from SEER and JCPH, respectively. Eight IRFs were identified: age 60–79 (OR = 0.589, 95% CI: 0.391–0.887) and age > 80 (OR = 0.456, 95% CI: 0.287–0.722); primary site in the cecum (OR = 1.305, 95% CI: 1.023–1.664); TNM stage T3 (OR = 8.869, 95% CI: 2.151–36.569) and T4 (OR = 15.912, 95% CI: 3.839–65.955); TNM stage N1 (OR = 3.853, 95% CI: 2.919–5.087) and N2 (OR = 8.480, 95% CI: 6.322–11.374); number of regional nodes examined > 12 (OR = 0.439, 95% CI: 0.326–0.591); tumor deposits (OR = 1.989, 95% CI: 1.639–2.414); carcinoembryonic antigen (CEA) level (OR = 4.552, 95% CI: 3.747–5.530); and perineural invasion (OR = 1.352, 95% CI: 1.112–1.643). LR showed the best predictive efficacy on both the test set (AUC = 0.892, sensitivity = 0.825, specificity = 0.801) and the external validation set (AUC = 0.868, sensitivity = 1.000, specificity = 0.727).

CONCLUSIONS: Machine learning is a promising way to assist the detection of DM in CA patients.
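The odds ratios and confidence intervals in the RESULTS can be related back to logistic regression coefficients. The sketch below is illustrative only and is not the authors' code: it assumes the reported intervals are standard Wald intervals, exp(β ± z·SE), which the abstract does not state. The function names (`wald_from_or`, `or_with_ci`, `sensitivity_specificity`) are hypothetical, and the confusion-matrix counts at the end are made up purely to show the arithmetic.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_from_or(or_value, ci_low, ci_high):
    """Back out the log-odds coefficient and its standard error
    from a reported OR and 95% CI (assumes a Wald interval)."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def or_with_ci(beta, se):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Values reported in the abstract for CEA level: OR = 4.552, 95% CI 3.747-5.530
beta, se = wald_from_or(4.552, 3.747, 5.530)
or_hat, low, high = or_with_ci(beta, se)
print(f"beta={beta:.3f} se={se:.3f} OR={or_hat:.3f} CI=({low:.3f}, {high:.3f})")

# Hypothetical counts, NOT taken from the study
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=15, fp=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under the Wald assumption, the round trip should reproduce the reported CEA interval to roughly three decimal places, which suggests the published bounds are symmetric on the log-odds scale.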

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8f7/12211135/ed45a01a8d65/12885_2025_14329_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8f7/12211135/7e4eb325bde2/12885_2025_14329_Fig1_HTML.jpg

Similar articles

[1]
Risk factors and prediction of distant metastasis (DM) of colon adenocarcinoma: a logistic regression and machine learning study based on surveillance, epidemiology, and end results (SEER) database.

BMC Cancer. 2025-7-1

[2]
An explainable machine learning model for predicting the risk of distant metastasis in intrahepatic cholangiocarcinoma: a population-based cohort study.

Discov Oncol. 2025-6-18

[3]
Which Types of Patients With Extensive-Stage Small Cell Lung Cancer Benefit From Radiotherapy? A Retrospective Study Integrating Machine Learning With the SEER Database and a Chinese Cohort.

Cancer Control. 2025

[4]
Machine learning to predict distant metastasis and prognostic analysis of moderately differentiated gastric adenocarcinoma patients: a novel focus on lymph node indicators.

Front Immunol. 2024

[5]
Establishment of a prognostic nomogram and risk stratification system for patients with distant-metastatic hepatocellular carcinoma: A population-based study.

Medicine (Baltimore). 2025-6-13

[6]
Development and validation of machine learning models for distant metastasis of primary hepatic carcinoma: a population-based study.

Discov Oncol. 2025-6-16

[7]
Individual risk and prognostic value prediction by interpretable machine learning for distant metastasis in neuroblastoma: A population-based study and an external validation.

Int J Med Inform. 2025-4

[8]
Risk Assessment of Bone Metastasis for Cervical Cancer Patients by Multiple Models: A Large Population Based Real-World Study.

Front Med (Lausanne). 2021-10-5

[9]
Machine learning based on SEER database to predict distant metastasis of thyroid cancer.

Endocrine. 2024-6

[10]
OncoE25: an AI model for predicting postoperative prognosis in early-onset stage I-III colon and rectal cancer-a population-based study using SEER with dual-center cohort validation.

J Transl Med. 2025-6-22
