Hong Caogen, Chen Jinbiao, Yi Fan, Hao Yuzhe, Meng Fanwen, Dong Zhanghuiya, Lin Hui, Huang Zhengxing
Zhejiang University, Hangzhou, Zhejiang, China.
Jiangsu Automation Research Institute, Lianyungang, China.
Health Inf Sci Syst. 2022 Apr 12;10(1):5. doi: 10.1007/s13755-022-00173-z. eCollection 2022 Dec.
Survival analysis, which investigates the relationships between covariates and event time, has had a profound effect on health service management. Longitudinal data with sequential patterns, such as electronic health records (EHRs), contain a large volume of patient treatment trajectories and therefore offer great potential for survival analysis. However, most existing studies treat survival analysis as a static problem: they use only a fraction of the longitudinal data, ignore the correlations between multiple visits, and often fail to capture the latent representations of patient treatment trajectories. This inevitably degrades the performance of survival analysis. To address this challenge, we propose CD-Surv, an end-to-end contrastive-based model that better captures patient treatment trajectories and dynamically predicts the survival probability of a target patient. Specifically, two data augmentation strategies are adopted to augment the real treatment trajectories documented in the EHR. The hidden representations of the real trajectories are then improved through contrastive learning between the augmented and real trajectories. We evaluated CD-Surv on two real-world datasets, and the experimental results indicate that it outperforms state-of-the-art baselines on various evaluation metrics.
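To make the idea of contrastive learning between real and augmented trajectories concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the augmentation strategies, the encoder, or the loss, so everything here is hypothetical: `mask_visits` is one illustrative augmentation (randomly zeroing visits), `encode` is a stand-in mean-pooling encoder, and `info_nce` is a commonly used contrastive objective that may differ from the one used in the paper.

```python
import numpy as np


def mask_visits(trajectory, rate, rng):
    """Hypothetical augmentation: zero out a random subset of visits (rows)."""
    keep = rng.random(trajectory.shape[0]) >= rate
    return trajectory * keep[:, None]


def encode(trajectory):
    """Stand-in encoder (assumption): mean-pool over visits, then L2-normalise."""
    pooled = trajectory.mean(axis=0)
    return pooled / (np.linalg.norm(pooled) + 1e-8)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `positives` is the positive pair for row i of
    `anchors`; every other row in the batch acts as a negative."""
    logits = anchors @ positives.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
# Synthetic batch: 8 patients, each with 6 visits and 4 features per visit.
batch = [rng.normal(size=(6, 4)) for _ in range(8)]

real = np.stack([encode(t) for t in batch])
augmented = np.stack([encode(mask_visits(t, 0.3, rng)) for t in batch])

# Minimising this value pulls each real trajectory towards its own augmented
# view and pushes it away from the views of other patients.
loss = info_nce(real, augmented)
print(loss)
```

In an actual model the encoder would be a trainable sequence network and this loss would be optimised together with the survival objective; the sketch only shows how real and augmented views are paired within a batch.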