Li Xiang, Liu Haifeng, Du Xin, Zhang Ping, Hu Gang, Xie Guotong, Guo Shijing, Xu Meilin, Xie Xiaoping
IBM Research - China, Beijing, China.
Department of Cardiology, Beijing Anzhen Hospital, Beijing, China.
AMIA Annu Symp Proc. 2017 Feb 10;2016:799-807. eCollection 2016.
Atrial fibrillation (AF) is a common cardiac rhythm disorder that increases the risk of ischemic stroke and other thromboembolism (TE). Accurate prediction of TE is highly valuable for early intervention in AF patients. However, the prediction performance of previous TE risk models for AF is not satisfactory. In this study, we used integrated machine learning and data mining approaches to build 2-year TE prediction models for AF from Chinese Atrial Fibrillation Registry data. We first performed data cleansing and imputation on the raw data to generate a usable dataset. We then applied a series of feature construction and selection methods to identify predictive risk factors, and used supervised learning methods to build prediction models from those factors. The experimental results show that our approach achieves higher prediction performance (AUC: 0.71-0.74) than previous TE prediction models for AF (AUC: 0.66-0.69), and also identifies new potential risk factors.
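The abstract compares models by AUC (area under the ROC curve). As a minimal sketch of what that metric measures, and not the authors' evaluation code, the function below computes AUC as the probability that a randomly chosen patient with an event receives a higher risk score than a randomly chosen patient without one. The function name `auc` and the toy labels and scores are hypothetical and chosen only for illustration; they are not registry data.

```python
def auc(labels, scores):
    """Return the AUC of risk scores against binary labels (1 = event, 0 = no event).

    Computed pairwise: the fraction of (positive, negative) pairs in which the
    positive case has the higher score, with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = TE event within follow-up, 0 = no event.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.45]
toy_auc = auc(labels, scores)
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported gain from 0.66-0.69 to 0.71-0.74 means the models rank a somewhat larger share of event and non-event patient pairs correctly.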