Lv Hang, Chen Zehai, Yang Yacong, Pan Shuyao, Xiong Bo, Tan Yanchao, Yang Carl
College of Computer and Data Science, Fuzhou University, Fuzhou, China.
Shengli Clinical Medical College, Fujian Medical University, Fuzhou, China.
AMIA Annu Symp Proc. 2025 May 22;2024:758-767. eCollection 2024.
Electronic Health Records (EHRs) are a valuable source of healthcare data that helps researchers and clinicians improve diagnostic accuracy. Several predictive models have been developed that learn disease representations in order to forecast the diagnoses a patient is likely to receive. However, existing studies usually ignore the fine-grained semantic and structural information in EHRs (e.g., the hierarchical relations between diseases and ICD-9 codes), and so fail to produce disease representations accurate enough for effective diagnosis prediction. To this end, we propose LabCare, a framework that enhances diagnosis prediction through improved semantic and structural modeling of diseases in EHR data. LabCare simultaneously captures rich semantic and structural relations among diseases and ICD-9 codes by integrating language models with box embeddings. Extensive experiments on two EHR datasets show that LabCare consistently surpasses competing methods, with an average improvement of 4.29% in Recall and NDCG.
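As a rough illustration of why box embeddings suit hierarchical data such as diseases under ICD-9 categories, the sketch below represents each concept as an axis-aligned box and scores "child is a kind of parent" by how much of the child's box lies inside the parent's. This is a generic box-containment sketch, not LabCare's actual formulation: the function names, the 2-D coordinates, and the volume-ratio score are all assumptions made for illustration.

```python
from math import prod

def box_volume(lo, hi):
    """Volume of an axis-aligned box; zero if any side is empty."""
    return prod(max(h - l, 0.0) for l, h in zip(lo, hi))

def intersection(box_a, box_b):
    """Corners (lo, hi) of the overlap of two boxes (may be empty)."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    lo = [max(a, b) for a, b in zip(lo_a, lo_b)]
    hi = [min(a, b) for a, b in zip(hi_a, hi_b)]
    return lo, hi

def containment(child, parent):
    """Fraction of the child's volume that lies inside the parent box."""
    vol_child = box_volume(*child)
    if vol_child == 0.0:
        return 0.0
    return box_volume(*intersection(child, parent)) / vol_child

# Hypothetical 2-D boxes, given as (lower corner, upper corner).
# The coordinates are invented; a real model would learn them.
category = ([0.0, 0.0], [4.0, 4.0])   # a broad code category
disease = ([1.0, 1.0], [2.0, 3.0])    # a specific disease under it
unrelated = ([5.0, 5.0], [6.0, 6.0])  # a concept outside the category

print(containment(disease, category))    # 1.0: fully inside the category
print(containment(category, disease))    # 0.125: the relation is asymmetric
print(containment(unrelated, category))  # 0.0: no overlap
```

The asymmetry of the score is the point: a specific disease can sit entirely inside its category while the category is only partly covered by the disease, which a single point vector with a symmetric similarity cannot express.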