SCOPE: predicting future diagnoses in office visits using electronic health records.

Affiliations

Department of Medicine, Stanford Center for Biomedical Informatics, Stanford University, 1265 Welch Rd, Palo Alto, CA, 94305, USA.

Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA.

Publication information

Sci Rep. 2023 Jul 7;13(1):11005. doi: 10.1038/s41598-023-38257-9.

DOI:10.1038/s41598-023-38257-9
PMID:37419945
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10328934/
Abstract

We propose an interpretable and scalable model to predict likely diagnoses at an encounter based on past diagnoses and lab results. This model is intended to aid physicians in their interaction with electronic health records (EHR). To accomplish this, we retrospectively collected and de-identified EHR data of 2,701,522 patients at Stanford Healthcare over the period from January 2008 to December 2016. A population-based sample of 524,198 patients (44% male, 56% female) with multiple encounters and at least one frequently occurring diagnosis code was chosen. A calibrated model was developed to predict ICD-10 diagnosis codes at an encounter based on past diagnoses and lab results, using a binary relevance based multi-label modeling strategy. Logistic regression and random forests were tested as the base classifier, and several time windows were tested for aggregating the past diagnoses and labs. This modeling approach was compared to a recurrent neural network (RNN) based deep learning method. The best model used random forest as the base classifier and integrated demographic features, diagnosis codes, and lab results. The best model was calibrated, and its performance was comparable to or better than existing methods on various metrics, including a median AUROC of 0.904 (IQR [0.838, 0.954]) over 583 diseases. When predicting the first occurrence of a disease label for a patient, the median AUROC of the best model was 0.796 (IQR [0.737, 0.868]). Our modeling approach performed comparably to the tested deep learning method, outperforming it in terms of AUROC (p < 0.001) but underperforming it in terms of AUPRC (p < 0.001). Interpreting the model showed that it uses meaningful features and highlights many interesting associations among diagnoses and lab results. We conclude that the multi-label model performs comparably to the RNN-based deep learning model while offering simplicity and potentially superior interpretability. While the model was trained and validated on data obtained from a single institution, its simplicity, interpretability and performance make it a promising candidate for deployment.
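The binary relevance strategy named in the abstract can be illustrated with a minimal sketch: one independent binary classifier is fitted per diagnosis label, and each label's AUROC is computed separately before being summarized by the median. Everything below is an assumption made for illustration and is not the authors' code: the toy feature matrix, the label columns, the function names, and the hand-written logistic regression (one of the two base classifiers the abstract says were tested; the best reported model used random forests). Calibration and time-window aggregation are omitted, and the sketch scores on its own training rows only for brevity, whereas a real evaluation would use held-out encounters.

```python
import math
import statistics


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent for one binary label; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def fit_binary_relevance(X, Y):
    """Binary relevance: fit one independent classifier per label column of Y."""
    return [fit_logistic(X, [row[k] for row in Y]) for k in range(len(Y[0]))]


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs both classes")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: columns of X are binary history features
# (past diagnosis A, past diagnosis B, abnormal lab); rows are encounters.
X = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
# Columns of Y are the labels to predict at the encounter (diagnosis A, diagnosis B).
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [1, 1]]

models = fit_binary_relevance(X, Y)
per_label_auroc = [
    auroc([row[k] for row in Y], predict_proba(model, X))
    for k, model in enumerate(models)
]
median_auroc = statistics.median(per_label_auroc)
print(per_label_auroc, median_auroc)
```

Because each label gets its own model, labels can be trained, inspected, and evaluated independently, which is one plausible reading of why the abstract describes the approach as scalable and interpretable. The cost is that correlations between labels are not modeled directly.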

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ce/10328934/914cefe246ec/41598_2023_38257_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ce/10328934/7b759ebb70ce/41598_2023_38257_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ce/10328934/79cd45faa8f8/41598_2023_38257_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ce/10328934/6db4948410e9/41598_2023_38257_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/42ce/10328934/16e3fc21173f/41598_2023_38257_Fig5_HTML.jpg

Similar articles

1
SCOPE: predicting future diagnoses in office visits using electronic health records.
Sci Rep. 2023 Jul 7;13(1):11005. doi: 10.1038/s41598-023-38257-9.
2
TA-RNN: an attention-based time-aware recurrent neural network architecture for electronic health records.
Bioinformatics. 2024 Jun 28;40(Suppl 1):i169-i179. doi: 10.1093/bioinformatics/btae264.
3
Predicting Gastrointestinal Bleeding Events from Multimodal In-Hospital Electronic Health Records Using Deep Fusion Networks.
Annu Int Conf IEEE Eng Med Biol Soc. 2019 Jul;2019:2447-2450. doi: 10.1109/EMBC.2019.8857244.
4
Subcategorizing EHR diagnosis codes to improve clinical application of machine learning models.
Int J Med Inform. 2021 Dec;156:104588. doi: 10.1016/j.ijmedinf.2021.104588. Epub 2021 Sep 21.
5
Dynamic ElecTronic hEalth reCord deTection (DETECT) of individuals at risk of a first episode of psychosis: a case-control development and validation study.
Lancet Digit Health. 2020 May;2(5):e229-e239. doi: 10.1016/S2589-7500(20)30024-8. Epub 2020 Mar 26.
6
Interpretable time-aware and co-occurrence-aware network for medical prediction.
BMC Med Inform Decis Mak. 2021 Nov 2;21(1):305. doi: 10.1186/s12911-021-01662-z.
7
Incorporating medical code descriptions for diagnosis prediction in healthcare.
BMC Med Inform Decis Mak. 2019 Dec 19;19(Suppl 6):267. doi: 10.1186/s12911-019-0961-2.
8
Predicting Postoperative Mortality With Deep Neural Networks and Natural Language Processing: Model Development and Validation.
JMIR Med Inform. 2022 May 10;10(5):e38241. doi: 10.2196/38241.
9
LSTM Model for Prediction of Heart Failure in Big Data.
J Med Syst. 2019 Mar 19;43(5):111. doi: 10.1007/s10916-019-1243-3.
10
Early Detection of Septic Shock Onset Using Interpretable Machine Learners.
J Clin Med. 2021 Jan 15;10(2):301. doi: 10.3390/jcm10020301.

Cited by

1
Advancing Clinical Information Systems: Harnessing Telemedicine, Data Science, and AI for Enhanced and More Precise Healthcare Delivery.
Yearb Med Inform. 2024 Aug;33(1):115-122. doi: 10.1055/s-0044-1800730. Epub 2025 Apr 8.
2
Evaluating dimensionality reduction of comorbidities for predictive modeling in individuals with neurofibromatosis type 1.
JAMIA Open. 2025 Jan 22;8(1):ooae157. doi: 10.1093/jamiaopen/ooae157. eCollection 2025 Feb.
3
Med-MGF: multi-level graph-based framework for handling medical data imbalance and representation.
BMC Med Inform Decis Mak. 2024 Sep 2;24(1):242. doi: 10.1186/s12911-024-02649-2.
