Chen Chen, Li Lei, Beetz Marcel, Banerjee Abhirup, Gupta Ramneek, Grau Vicente
Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford; Imperial College London; University of Sheffield, Sheffield.
Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford; University of Southampton.
IEEE Trans Big Data. 2025 Jun;11(3):948-960. doi: 10.1109/TBDATA.2025.3536922.
Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk from 12-lead electrocardiograms (ECGs). We present a lightweight dual attention ECG network designed to capture the complex ECG features essential for early HF risk prediction, despite the notable imbalance between low-risk and high-risk groups. The network incorporates a cross-lead attention module and 12 lead-specific temporal attention modules, which focus on cross-lead interactions and on each lead's local dynamics, respectively. To further alleviate model overfitting, we leverage a large language model (LLM) with a public ECG-Report dataset for pretraining on an ECG-Report alignment task. The network is then fine-tuned for HF risk prediction on two cohorts from the UK Biobank study: patients with hypertension (UKB-HYP) and patients who have had a myocardial infarction (UKB-MI). The results show that LLM-informed pretraining substantially enhances HF risk prediction in these cohorts. The dual attention design improves both interpretability and predictive accuracy, outperforming existing competitive methods with C-index scores of 0.6349 for UKB-HYP and 0.5805 for UKB-MI. This demonstrates our method's potential for advancing HF risk assessment with clinically complex ECG data.
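For readers unfamiliar with the metric quoted above, the following is a minimal sketch of Harrell's concordance index (C-index) for right-censored data. It is not the authors' evaluation code: the function name `concordance_index`, the simplified handling of tied event times (such pairs are skipped), and the toy values are all assumptions made for illustration. A score of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the paper's code).

    times  : follow-up time for each subject
    events : 1 if the event (e.g. HF) was observed, 0 if censored
    risks  : predicted risk score, higher means earlier expected event
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Reorder so that subject i has the shorter follow-up time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the earlier time is an observed
        # event; pairs with tied times are skipped (a simplification).
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risks[i] > risks[j]:
            concordant += 1.0   # earlier event has higher risk: concordant
        elif risks[i] == risks[j]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration only).
times = [5, 8, 3, 10]
events = [1, 0, 1, 1]
risks = [0.2, 0.4, 0.9, 0.1]
print(concordance_index(times, events, risks))
```

In this toy example five pairs are comparable and four of them are ranked correctly. Established survival-analysis libraries provide tested implementations that handle ties more carefully and should be preferred in practice.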