

Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) with Electronic Health Records.

Affiliations

Department of Biostatistics, Harvard T.H. Chan School of Public Health, United States of America.

Department of Biomedical Informatics, Harvard Medical School, United States of America.

Publication information

J Biomed Inform. 2024 Sep;157:104685. doi: 10.1016/j.jbi.2024.104685. Epub 2024 Jul 14.

DOI:10.1016/j.jbi.2024.104685
PMID:39004109
Abstract

BACKGROUND

Risk prediction plays a crucial role in planning for prevention, monitoring, and treatment. Electronic Health Records (EHRs) offer an expansive repository of temporal medical data encompassing both the risk factors and the outcome indicators essential for effective risk prediction. However, challenges arise from the lack of readily available gold-standard outcomes and the complex effects of various risk factors. Compounding these challenges are false positives in diagnosis codes and the formidable task of pinpointing onset timing during annotation.

OBJECTIVE

We develop a Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) algorithm based on extensive unlabeled longitudinal EHR data, augmented by a limited set of gold-standard labels. Each label is a binary status indicating whether the clinical event of interest occurred during the follow-up period.

METHODS

The SeDDLeR algorithm calculates an individualized risk of developing future clinical events over time using each patient's baseline EHR features via the following steps: (1) construction of an initial EHR-derived surrogate as a proxy for the onset status; (2) deep learning calibration of the surrogate against the gold-standard onset status; and (3) semi-supervised deep learning for risk prediction combining calibrated surrogates and gold-standard onset status. To account for missing onset time and heterogeneous follow-up, we introduce temporal kernel weighting. We devise a Gated Recurrent Unit (GRU) module to capture temporal characteristics. We subsequently assess the proposed SeDDLeR method in simulation studies and apply it to the Massachusetts General Brigham (MGB) Biobank to predict type 2 diabetes (T2D) risk.
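The abstract does not spell out the form of the temporal kernel weighting, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Gaussian kernel and a fixed bandwidth, and uses hypothetical names (`kernel_weights`, `weighted_event_rate`): patients whose follow-up length is close to a target time `t` receive larger weights when estimating the proportion with the event by that time.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_weights(follow_up: list[float], t: float, bandwidth: float) -> list[float]:
    """Normalized weights that favor patients whose follow-up length is near t."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    raw = [gaussian_kernel((c - t) / bandwidth) for c in follow_up]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_event_rate(
    status: list[int], follow_up: list[float], t: float, bandwidth: float
) -> float:
    """Kernel-weighted proportion of patients with the event, localized at time t.

    status[i] is the binary label (1 = event occurred during follow-up),
    follow_up[i] is the length of follow-up for patient i.
    """
    weights = kernel_weights(follow_up, t, bandwidth)
    return sum(w * y for w, y in zip(weights, status))


# Toy data (hypothetical): longer follow-up makes an observed event more likely.
follow_up_years = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
event_status = [0, 0, 0, 1, 1, 1]

early = weighted_event_rate(event_status, follow_up_years, t=2.0, bandwidth=1.0)
late = weighted_event_rate(event_status, follow_up_years, t=5.0, bandwidth=1.0)
```

In this toy setting the estimate localized at year 2 is lower than the one at year 5, which is the behavior a time-localized weighting is meant to capture when only binary status and follow-up length are known.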

RESULTS

SeDDLeR outperforms benchmark risk prediction methods, including the Semi-parametric Transformation Model (STM) and DeepHit, achieving the best accuracy consistently across experiments. In the MGB T2D study, SeDDLeR achieved the best C-statistic (0.815, SE 0.023; vs STM +0.084, SE 0.030, P = 0.004; vs DeepHit +0.055, SE 0.027, P = 0.024) and the best average time-specific AUC (0.778, SE 0.022; vs STM +0.059, SE 0.039, P = 0.067; vs DeepHit +0.168, SE 0.032, P < 0.001).
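For readers unfamiliar with the C-statistic, here is a short sketch of the common Harrell definition for censored data. The abstract does not state which variant the study computes, so this is an illustration of the metric in general, and the function name `concordance_index` is hypothetical.

```python
def concordance_index(
    times: list[float], events: list[int], risks: list[float]
) -> float:
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still under observation after that time (times[i] < times[j]).
    The pair is concordant when patient i was assigned the higher risk;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): the third patient is censored at time 3.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1]

perfect = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
reversed_ranking = concordance_index(toy_times, toy_events, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_index(toy_times, toy_events, [0.5, 0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to an uninformative ranking and 1.0 to a perfect one, which gives context for the reported 0.815.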

CONCLUSION

SeDDLeR can train robust risk prediction models on both real-world EHR and synthetic datasets with minimal requirements for labeling event times. It has the potential to be incorporated into future clinical trial recruitment or clinical decision-making.
