Clinical Data Prediction Model to Identify Patients With Early-Stage Pancreatic Cancer.

Author affiliations

Department of Radiation Medicine and Applied Sciences, University of California San Diego, La Jolla, CA.

School of Medicine, University of California San Diego, La Jolla, CA.

Publication information

JCO Clin Cancer Inform. 2021 Mar;5:279-287. doi: 10.1200/CCI.20.00137.

DOI:10.1200/CCI.20.00137
PMID:33739856
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8462624/
Abstract

PURPOSE

Pancreatic cancer is an aggressive malignancy with patients often experiencing nonspecific symptoms before diagnosis. This study evaluates a machine learning approach to help identify patients with early-stage pancreatic cancer from clinical data within electronic health records (EHRs).

MATERIALS AND METHODS

From the Optum deidentified EHR data set, we identified early-stage (n = 3,322) and late-stage (n = 25,908) pancreatic cancer cases over 40 years of age diagnosed between 2009 and 2017. Patients with early-stage pancreatic cancer were matched to noncancer controls (1:16 match). We constructed a prediction model using eXtreme Gradient Boosting (XGBoost) to identify early-stage patients on the basis of 18,220 features within the EHR including diagnoses, procedures, information within clinical notes, and medications. Model accuracy was assessed with sensitivity, specificity, positive predictive value, and the area under the curve.
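
The four accuracy measures named above can be illustrated with a minimal Python sketch. The confusion-matrix counts and risk scores below are hypothetical and chosen only to mirror a 1:16 case-to-control ratio; they are not data from the study. The AUC is computed by direct pairwise comparison, which is one standard definition and not necessarily how the authors computed it.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # cases flagged by the model
    fn: int  # cases the model missed
    fp: int  # controls flagged by the model
    tn: int  # controls the model did not flag


def sensitivity(c: Confusion) -> float:
    """Fraction of true cases that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    """Fraction of controls that the model leaves unflagged."""
    return c.tn / (c.tn + c.fp)


def ppv(c: Confusion) -> float:
    """Fraction of flagged patients who are true cases."""
    return c.tp / (c.tp + c.fp)


def auc(case_scores, control_scores) -> float:
    """Probability that a random case outscores a random control (ties count 1/2)."""
    wins = 0.0
    for s in case_scores:
        for t in control_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical cohort: 100 cases and 1,600 controls (1:16 ratio).
toy = Confusion(tp=60, fn=40, fp=160, tn=1440)
print(sensitivity(toy), specificity(toy), round(ppv(toy), 3))

# Hypothetical risk scores for three cases and four controls.
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.4]))
```

The pairwise AUC is quadratic in cohort size, so it suits small illustrations; a rank-based computation would be preferable on cohorts of the size described here.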

RESULTS

The final predictive model included 582 predictive features from the EHR, including 248 (42.5%) physician note elements, 146 (25.0%) procedure codes, 91 (15.6%) diagnosis codes, 89 (15.3%) medications, and 9 (1.5%) demographic features. The final model area under the curve was 0.84. Choosing a model cut point with a sensitivity of 60% and specificity of 90% would enable early detection of 58% of late-stage patients a median of 24 months before their actual diagnosis.
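
Choosing a cut point at a fixed specificity can be sketched as follows. This is an illustrative procedure under stated assumptions, not the authors' documented method: the function names and all scores are hypothetical, and a patient is treated as flagged when the score is at or above the threshold.

```python
import math
from bisect import bisect_left


def threshold_for_specificity(control_scores, target_spec):
    """Return the lowest threshold whose specificity on the controls meets the target.

    Specificity at threshold t is the fraction of controls scoring below t.
    """
    ordered = sorted(control_scores)
    n = len(ordered)
    # Candidate thresholds: each observed control score, then +inf (flags no one).
    for t in sorted(set(ordered)) + [math.inf]:
        if bisect_left(ordered, t) / n >= target_spec:
            return t
    raise ValueError("target_spec must not exceed 1")


def sensitivity_at(case_scores, threshold):
    """Fraction of cases scoring at or above the threshold."""
    return sum(s >= threshold for s in case_scores) / len(case_scores)


# Hypothetical model scores for ten controls and five cases.
controls = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.90]
cases = [0.95, 0.92, 0.91, 0.50, 0.30]

cut = threshold_for_specificity(controls, 0.90)
print(cut, sensitivity_at(cases, cut))
```

Fixing specificity first and reading off the resulting sensitivity matches how the cut point is described above, where a 90% specificity is paired with a 60% sensitivity.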

CONCLUSION

Prediction models using EHR data show promise in the early detection of pancreatic cancer. Although widespread use of this approach on an unselected population would produce high rates of false-positive tests, this technique may be rapidly impactful if deployed among high-risk patients or paired with other imaging or biomarker screening tools.
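
The false-positive concern follows from Bayes' rule: at a fixed sensitivity and specificity, positive predictive value falls as prevalence falls. The sketch below uses the 60% sensitivity and 90% specificity cut point from the results. The prevalence values are illustrative assumptions, not estimates from the study; 1/17 corresponds to the 1:16 matched design.

```python
def ppv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' rule."""
    true_pos = sens * prevalence                # flagged patients who have the disease
    false_pos = (1 - spec) * (1 - prevalence)   # flagged patients who do not
    return true_pos / (true_pos + false_pos)


# Illustrative prevalences only (assumptions): the matched-cohort ratio, then lower values.
for p in (1 / 17, 0.01, 0.001):
    print(f"prevalence={p:.4f}  PPV={ppv_from_prevalence(0.60, 0.90, p):.4f}")
```

Under these assumptions, roughly 27% of flagged patients are true cases at the matched-cohort ratio, and fewer than 1% are at a prevalence of 1 in 1,000. That is why restricting the model to high-risk patients, where prevalence is higher, improves the yield of positive tests.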
