
Enhancing Patient Outcome Prediction Through Deep Learning With Sequential Diagnosis Codes From Structured Electronic Health Record Data: Systematic Review.

Authors

Hama Tuankasfee, Alsaleh Mohanad M, Allery Freya, Choi Jung Won, Tomlinson Christopher, Wu Honghan, Lai Alvina, Pontikos Nikolas, Thygesen Johan H

Affiliations

Institute of Health Informatics, University College London, London, United Kingdom.

Department of Health Informatics, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia.

Publication information

J Med Internet Res. 2025 Mar 18;27:e57358. doi: 10.2196/57358.

DOI: 10.2196/57358
PMID: 40100249
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11962322/

Abstract

BACKGROUND

The use of structured electronic health records in health care systems has grown rapidly. These systems collect huge amounts of patient information, including diagnosis codes representing temporal medical history. Sequential diagnostic information has proven valuable for predicting patient outcomes. However, the extent to which these types of data have been incorporated into deep learning (DL) models has not been examined.

OBJECTIVE

This systematic review aims to describe the use of sequential diagnostic data in DL models, specifically to understand how these data are integrated, whether sample size improves performance, and whether the identified models are generalizable.

METHODS

Relevant studies published up to May 15, 2023, were identified using 4 databases: PubMed, Embase, IEEE Xplore, and Web of Science. We included all studies using DL algorithms trained on sequential diagnosis codes to predict patient outcomes. We excluded review articles and non-peer-reviewed papers. We evaluated the following aspects in the included papers: DL techniques, characteristics of the dataset, prediction tasks, performance evaluation, generalizability, and explainability. We also assessed the risk of bias and applicability of the studies using the Prediction Model Study Risk of Bias Assessment Tool (PROBAST). We used the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist to report our findings.

RESULTS

Of the 740 identified papers, 84 (11.4%) met the eligibility criteria. Publications in this area increased yearly. Recurrent neural networks (and their derivatives; 47/84, 56%) and transformers (22/84, 26%) were the most commonly used architectures in DL-based models. Most studies (45/84, 54%) presented their input features as sequences of visit embeddings. Medications (38/84, 45%) were the most common additional feature. Of the 128 predictive outcome tasks, the most frequent was next-visit diagnosis (n=30, 23%), followed by heart failure (n=18, 14%) and mortality (n=17, 13%). Only 7 (8%) of the 84 studies evaluated their models in terms of generalizability. A positive correlation was observed between training sample size and model performance (area under the receiver operating characteristic curve; P=.02). However, 59 (70%) of the 84 studies had a high risk of bias.
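To make the phrase "sequences of visit embeddings" concrete, the following is a minimal sketch and not the method of any particular reviewed study. It assumes each visit is a list of diagnosis codes, uses a randomly initialized table as a stand-in for learned code embeddings, and sums the code vectors within a visit. The patient history, the function names, and the embedding dimension are all hypothetical.

```python
import random

# Hypothetical toy patient history: each visit is a list of ICD-10-style codes.
patient_visits = [
    ["I10", "E11"],  # visit 1
    ["I10", "N18"],  # visit 2
    ["I50"],         # visit 3
]

def build_vocab(visits):
    """Map each distinct diagnosis code to an integer index."""
    codes = sorted({code for visit in visits for code in visit})
    return {code: idx for idx, code in enumerate(codes)}

def init_code_embeddings(vocab, dim, seed=0):
    """Random stand-in for a learned embedding table (one vector per code)."""
    rng = random.Random(seed)
    return {code: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for code in vocab}

def embed_visit(visit, table, dim):
    """Represent a visit as the element-wise sum of its code vectors."""
    vec = [0.0] * dim
    for code in visit:
        for i, value in enumerate(table[code]):
            vec[i] += value
    return vec

EMBED_DIM = 4  # assumed size, chosen only for illustration
vocab = build_vocab(patient_visits)
table = init_code_embeddings(vocab, EMBED_DIM)
visit_sequence = [embed_visit(v, table, EMBED_DIM) for v in patient_visits]

# One fixed-size vector per visit, kept in chronological order; this is the
# kind of input a recurrent network or transformer would consume.
print(len(visit_sequence), len(visit_sequence[0]))
```

In a trained model the table would be learned jointly with the predictor, and summation is only one of several plausible ways to pool the codes within a visit.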

CONCLUSIONS

The application of DL for advanced modeling of sequential medical codes has demonstrated remarkable promise in predicting patient outcomes. The main limitation of this study was the heterogeneity of methods and outcomes. However, our analysis found that using multiple types of features, integrating time intervals, and including larger sample sizes were generally related to an improved predictive performance. This review also highlights that very few studies (7/84, 8%) reported on challenges related to generalizability and less than half (38/84, 45%) of the studies reported on challenges related to explainability. Addressing these shortcomings will be instrumental in unlocking the full potential of DL for enhancing health care outcomes and patient care.
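One simple way to integrate time intervals, sketched here under assumptions and not drawn from any specific reviewed model, is to attach to each visit the number of days since the previous visit. The visit dates and the helper name `days_since_previous` are hypothetical.

```python
from datetime import date

# Hypothetical visit dates for one patient, in chronological order.
visit_dates = [date(2020, 1, 10), date(2020, 3, 10), date(2021, 3, 10)]

def days_since_previous(dates):
    """Return the gap in days before each visit (0 for the first visit)."""
    gaps = [0]
    for earlier, later in zip(dates, dates[1:]):
        gaps.append((later - earlier).days)
    return gaps

gaps = days_since_previous(visit_dates)
print(gaps)  # [0, 60, 365]
```

Each gap could then be appended to, or embedded alongside, the representation of the corresponding visit so that a sequence model can distinguish closely spaced visits from widely spaced ones.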

TRIAL REGISTRATION

PROSPERO CRD42018112161; https://tinyurl.com/yc6h9rwu.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f75/11962322/16ada805117b/jmir_v27i1e57358_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f75/11962322/18c64e6c93ee/jmir_v27i1e57358_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f75/11962322/f06d623eedbe/jmir_v27i1e57358_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f75/11962322/6de6ba37bbe3/jmir_v27i1e57358_fig3.jpg


