Deep learning on time series laboratory test results from electronic health records for early detection of pancreatic cancer.

Affiliations

Department of Medicine, Columbia University Irving Medical Center, New York, NY, United States.

Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, New York, NY, United States.

Publication information

J Biomed Inform. 2022 Jul;131:104095. doi: 10.1016/j.jbi.2022.104095. Epub 2022 May 20.

DOI:10.1016/j.jbi.2022.104095

PMID:35598881

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10286873/

Abstract

The multi-modal and unstructured nature of observational data in Electronic Health Records (EHR) is currently a significant obstacle to applying machine learning to risk stratification. In this study, we develop a deep learning framework that incorporates longitudinal clinical data from EHR to infer risk for pancreatic cancer (PC). The framework includes a novel training protocol that emphasizes early detection by applying an independent Poisson-random mask to the proximal-time measurements of each variable. Data fusion for irregular multivariate time-series features is enabled by a "grouped" neural network (GrpNN) architecture, which uses representation learning to generate a dimensionally reduced vector for each measurement set before making a final prediction. These models were evaluated using EHR data from Columbia University Irving Medical Center-New York Presbyterian Hospital. At 12 months prior to diagnosis, our framework demonstrated better early-detection performance (AUROC 0.671, 95% CI 0.667-0.675, p < 0.001) than logistic regression, XGBoost, and feedforward neural network baselines. We demonstrate that our masking strategy yields greater improvements at distal times prior to diagnosis, and that our GrpNN model improves generalizability by reducing overfitting relative to the feedforward baseline. The results were consistent across reported race. Our proposed algorithm is potentially generalizable to other diseases, including but not limited to cancer, where early detection can improve survival.
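The abstract names two mechanisms without giving their details: a per-variable Poisson-random mask over the most recent measurements, and a grouped network that reduces each measurement set to a small vector before a final prediction. The sketch below is one possible reading of those ideas, not the authors' implementation. Everything in it is an assumption made for illustration: the function names `poisson_proximal_mask` and `grouped_forward`, the choice to mask the last `k` time steps with `k ~ Poisson(lam)`, the zero-fill plus observed-flag encoding, the single `tanh` layer per group, and the random (untrained) weights.

```python
import numpy as np


def poisson_proximal_mask(series, lam, rng):
    """Hide the k most recent values of each variable, with k ~ Poisson(lam).

    series: float array of shape (n_vars, n_steps), ordered oldest to newest,
    so the last column is assumed to be the measurement closest to diagnosis.
    NaN marks a missing or masked value.
    """
    masked = np.array(series, dtype=float)  # copy so the input is untouched
    n_vars, n_steps = masked.shape
    # One independent draw per variable, capped at the series length.
    k = np.minimum(rng.poisson(lam, size=n_vars), n_steps)
    for i in range(n_vars):
        masked[i, n_steps - k[i]:] = np.nan  # empty slice when k[i] == 0
    return masked, k


def grouped_forward(masked, group_weights, head_weights):
    """Encode each variable's series separately, then predict from the concatenation."""
    observed = ~np.isnan(masked)
    filled = np.where(observed, masked, 0.0)
    embeddings = []
    for i, w in enumerate(group_weights):
        # Input for group i: zero-filled values plus an observed/missing flag.
        x = np.concatenate([filled[i], observed[i].astype(float)])
        embeddings.append(np.tanh(w @ x))  # reduced vector for this group
    z = np.concatenate(embeddings)
    logit = head_weights @ z
    return 1.0 / (1.0 + np.exp(-logit))  # score between 0 and 1


# Toy data: 3 hypothetical lab variables, 8 time steps each.
rng = np.random.default_rng(0)
n_vars, n_steps, emb_dim = 3, 8, 2
series = rng.normal(size=(n_vars, n_steps))

masked, k = poisson_proximal_mask(series, lam=2.0, rng=rng)

# Random, untrained weights: this only demonstrates the data flow.
group_weights = [rng.normal(size=(emb_dim, 2 * n_steps)) for _ in range(n_vars)]
head_weights = rng.normal(size=n_vars * emb_dim)
risk = grouped_forward(masked, group_weights, head_weights)
```

In a training loop, a fresh mask would presumably be drawn each time a patient record is sampled, so the model cannot rely on the measurements closest to diagnosis. Under the assumptions above, a larger `lam` hides more of the recent history and pushes the model toward earlier signals.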
