VTT Technical Research Centre of Finland Ltd., 33101, Tampere, Finland.
Faculty of Medicine and Health Technology, Tampere University, 33720, Tampere, Finland.
Sci Rep. 2023 Mar 2;13(1):3517. doi: 10.1038/s41598-023-30657-1.
With over 17 million annual deaths, cardiovascular diseases (CVDs) dominate cause-of-death statistics. CVDs can drastically reduce quality of life and even cause sudden death, while incurring massive healthcare costs. This work studied state-of-the-art deep learning techniques for predicting increased risk of death in CVD patients, building on the electronic health records (EHR) of over 23,000 cardiac patients. Taking into account the usefulness of the prediction for chronic disease patients, a prediction period of six months was selected. Two major transformer models that rely on learning bidirectional dependencies in sequential data, BERT and XLNet, were trained and compared. To our knowledge, the presented work is the first to apply XLNet to EHR data to predict mortality. The patient histories were formulated as time series consisting of varying types of clinical events, thus enabling the models to learn increasingly complex temporal dependencies. BERT and XLNet achieved an average area under the receiver operating characteristic curve (AUC) of 75.5% and 76.0%, respectively. XLNet surpassed BERT in recall by 9.8%, suggesting that it captures more positive cases than BERT, the model that has been the main focus of recent research on EHRs and transformers.
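The two metrics reported in the abstract, AUC and recall, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `roc_auc` and `recall`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted risk, and recall as the fraction of true positive cases (here, patients who died within the prediction period) that the model flags.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Fraction of positive cases whose score reaches the threshold.
    The 0.5 default is an assumption, not a value from the paper."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not positives:
        return 0.0
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Toy data invented for illustration only: 1 = death within the
# prediction period, scores = hypothetical predicted risk.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.5]

print(f"AUC:    {roc_auc(labels, scores):.3f}")
print(f"Recall: {recall(labels, scores):.3f}")
```

Unlike AUC, recall depends on the chosen threshold, so two models with similar AUC can differ noticeably in recall at a fixed operating point.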