Zhang Yuqi, Li Sijin, Mai Peibiao, Yang Yanqi, Luo Niansang, Tong Chao, Zeng Kuan, Zhang Kun
School of Computer Science & Engineering, Beihang University, Beijing, China.
State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China.
BMC Med Inform Decis Mak. 2025 Feb 3;25(1):51. doi: 10.1186/s12911-025-02880-5.
Paroxysmal and persistent atrial fibrillation (AF) subtypes currently cannot be predicted accurately without electrocardiogram (ECG) observation. We aimed to develop a machine learning model that distinguishes paroxysmal from persistent AF, and to investigate the factors that influence the prediction.
We collected demographic data, medication use, serological indicators, and baseline cardiac ultrasound data for all included subjects, totaling 50 variables. The diagnosis of AF subtype was confirmed by ECG observation for at least 7 days. Variable selection was performed using Spearman correlation analysis, recursive feature elimination, and least absolute shrinkage and selection operator (LASSO) regression. We built a prediction model for AF subtype using three machine learning methods. Finally, the importance of each variable was analyzed with the SHapley Additive exPlanations (SHAP) method.
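The first screening step named above, Spearman correlation analysis, can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state how the correlation analysis was used, so the greedy filter, the 0.8 cutoff, and the names `spearman` and `drop_correlated` are all assumptions made for illustration. Recursive feature elimination and LASSO are not shown.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = (vx * vy) ** 0.5
    return cov / denom if denom else 0.0


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def drop_correlated(columns, threshold=0.8):
    """Greedy filter (assumed, not from the paper): keep a variable only if
    |rho| with every variable kept so far does not exceed the threshold."""
    kept = []
    for name in columns:
        if all(abs(spearman(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Made-up toy values, not study data.
toy = {
    "la_diameter": [30, 35, 40, 45, 50],
    "la_volume": [20, 28, 41, 55, 70],
    "age": [62, 48, 71, 55, 66],
}
print(drop_correlated(toy))
```

In this toy input the two left atrial columns are perfectly rank-correlated, so only the first is kept, alongside age.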
After screening, we identified an optimal set of 10 variables. The resulting model achieved good predictive performance (AUC = 0.870, 95% CI 0.858 to 0.882), with a specificity of 0.851 (95% CI 0.844 to 0.858) and a sensitivity of 0.716 (95% CI 0.676 to 0.755). Performance remained good across age subgroups and gender subgroups. The left atrium (LA) and N-terminal pro-B-type natriuretic peptide (NT-proBNP) were the two most important variables for predicting paroxysmal and persistent AF in all models, except in the subgroup of women aged less than 60 years.
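For readers less familiar with the three reported metrics, the sketch below shows how AUC, sensitivity, and specificity are computed from predicted scores. It rests on assumptions: the abstract does not say which subtype was coded as the positive class or which decision threshold was used, so label 1 as positive and a cutoff of 0.5 are illustrative choices, and the function names are hypothetical. The confidence intervals are not reproduced because the abstract does not describe how they were obtained.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A case is predicted positive when its score is >= threshold (assumed)."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels and scores, not study data.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(auc(labels, scores), sensitivity_specificity(labels, scores))
```

Unlike AUC, sensitivity and specificity depend on the chosen threshold, so the reported pair of values corresponds to a single operating point on the ROC curve.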
Our model makes it possible to predict paroxysmal and persistent AF based on baseline data at admission. Early and individualized intervention strategies based on our model may help to improve clinical outcomes in AF patients.