Autoencoder-Based Representation Learning for Similar Patients Retrieval From Electronic Health Records: Comparative Study.

Author information

Li Deyi, Shukla Aditi, Chandaka Sravani, Taylor Bradley, Xu Jie, Liu Mei

Affiliations

Department of Health Outcomes & Biomedical Informatics, University of Florida, 1889 Museum Rd, 7th Floor, Suite 7000, Room 7012, Gainesville, FL, 32611, United States, 1 352-627-9143.

Department of Mathematics, College of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, United States.

Publication information

JMIR Med Inform. 2025 Jul 24;13:e68830. doi: 10.2196/68830.


DOI: 10.2196/68830
PMID: 40706557
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12289314/
Abstract

BACKGROUND: By analyzing electronic health record snapshots of similar patients, physicians can proactively predict disease onsets, customize treatment plans, and anticipate patient-specific trajectories. However, the modeling of electronic health record data is inherently challenging due to its high dimensionality, mixed feature types, noise, bias, and sparsity. Patient representation learning using autoencoders (AEs) presents promising opportunities to address these challenges. A critical question remains: how do different AE designs and distance measures impact the quality of retrieved similar patient cohorts?

OBJECTIVE: This study aims to evaluate the performance of 5 common AE variants (vanilla autoencoder, denoising autoencoder, contractive autoencoder, sparse autoencoder, and robust autoencoder) in retrieving similar patients. Additionally, it investigates the impact of different distance measures and hyperparameter configurations on model performance.

METHODS: We tested the 5 AE variants on 2 real-world datasets, from the University of Kansas Medical Center (n=13,752) and the Medical College of Wisconsin (n=9568), across 168 different hyperparameter configurations. To retrieve similar patients based on the AE-produced latent representations, we applied k-nearest neighbors (k-NN) using Euclidean and Mahalanobis distances. Two prediction targets were evaluated: acute kidney injury onset and postdischarge 1-year mortality.

RESULTS: Our findings demonstrate that (1) denoising autoencoders outperformed other AE variants when paired with Euclidean distance (P<.001), followed by vanilla autoencoders and contractive autoencoders; (2) learning rates significantly influenced the performance of AE variants; and (3) Mahalanobis distance-based k-NN frequently outperformed Euclidean distance-based k-NN when applied to latent representations. However, whether AE models are superior in transforming raw data into latent representations, compared with applying Mahalanobis distance-based k-NN directly to raw data, appears to be data-dependent.

CONCLUSIONS: This study provides a comprehensive analysis of the performance of different AE variants in retrieving similar patients and evaluates the impact of various hyperparameter configurations on model performance. The findings lay the groundwork for future development of AE-based patient similarity estimation and personalized medicine.
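The retrieval step described in the abstract (k-NN over latent representations under Euclidean or Mahalanobis distance) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `knn_retrieve`, the toy vectors, the choice to estimate the covariance from the candidate pool itself, and the use of a pseudo-inverse are all assumptions made for illustration.

```python
import numpy as np


def knn_retrieve(latent, query, k=5, metric="euclidean"):
    """Return indices of the k rows of `latent` closest to `query`.

    `latent` is an (n_patients, n_dims) array standing in for AE-produced
    latent vectors; `query` is a single (n_dims,) vector. Hypothetical helper.
    """
    diff = latent - query  # (n_patients, n_dims)
    if metric == "euclidean":
        # Squared Euclidean distance per row.
        d2 = np.einsum("ij,ij->i", diff, diff)
    elif metric == "mahalanobis":
        # Assumption: covariance is estimated from the candidate pool.
        cov = np.atleast_2d(np.cov(latent, rowvar=False))
        # Pseudo-inverse tolerates a singular covariance matrix.
        cov_inv = np.linalg.pinv(cov)
        # Squared Mahalanobis distance: diff @ cov_inv @ diff.T, row by row.
        d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    else:
        raise ValueError(f"unknown metric: {metric}")
    # Ranking by squared distance gives the same order as ranking by distance.
    return np.argsort(d2, kind="stable")[:k]


# Toy latent vectors (made up): dimension 0 has far larger spread than dimension 1.
latent = np.array(
    [[3.0, 0.0], [0.0, 1.0], [-3.0, 0.0], [0.0, -1.0], [30.0, 0.0], [-30.0, 0.0]]
)
query = np.zeros(2)

euclid_idx = knn_retrieve(latent, query, k=2, metric="euclidean")
mahal_idx = knn_retrieve(latent, query, k=2, metric="mahalanobis")
print("Euclidean neighbors:", euclid_idx)
print("Mahalanobis neighbors:", mahal_idx)
```

On this toy data the two metrics pick different neighbors, because Mahalanobis distance rescales each direction by its variance. That rescaling is one plausible reason the choice of distance measure can change which patients are retrieved.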

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e61/12289314/d7b5d9d55295/medinform-v13-e68830-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e61/12289314/0e1d4f6b683a/medinform-v13-e68830-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e61/12289314/e6e90a033059/medinform-v13-e68830-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e61/12289314/67d07c04773f/medinform-v13-e68830-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7e61/12289314/4b4c7d91aa7c/medinform-v13-e68830-g005.jpg