Huang Chaoqun, Shu Shangzhi, Zhou Miaomiao, Sun Zhenming, Li Shuyan
Department of Cardiovascular Medicine, The First Bethune Hospital of Jilin University, Changchun, Jilin Province, China.
PLoS One. 2025 Jan 16;20(1):e0313562. doi: 10.1371/journal.pone.0313562. eCollection 2025.
Left atrial thrombus and spontaneous echo contrast (LAT/SEC) are widely recognized as significant contributors to cardiogenic embolism in non-valvular atrial fibrillation (NVAF). This study aimed to construct and validate an interpretable predictive model of LAT/SEC risk in NVAF patients using machine learning (ML) methods.
This retrospective study analyzed electronic medical record (EMR) data from 1,222 consecutive NVAF patients scheduled for catheter ablation at the First Hospital of Jilin University between October 1, 2022, and February 1, 2024. Nine ML algorithms were applied to demographic, clinical, and laboratory data to develop prediction models for LAT/SEC in NVAF patients. Feature selection was performed using the least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression. Multiple ML classification models were integrated to identify the optimal model, and Shapley Additive exPlanations (SHAP) interpretation was used for personalized risk assessment. The diagnostic performance of the optimal model for predicting LAT/SEC risk in NVAF was compared with that of the CHA2DS2-VASc scoring system.
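To illustrate how SHAP can attribute an individual prediction of a logistic model to its inputs, the following is a minimal sketch, not the study's implementation. It uses the closed form for linear models under the assumption of independent features, where each feature's contribution in log-odds space is its coefficient times the feature's deviation from the background mean. The coefficients, intercept, background rows, and patient values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def log_odds(weights, intercept, x):
    """Linear predictor (log-odds) of a logistic regression model."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))


def linear_shap(weights, background, x):
    """SHAP values in log-odds space for a linear model.

    Assumes independent features: phi_i = w_i * (x_i - mean_i),
    where mean_i is the feature mean over the background data.
    """
    means = [mean(col) for col in zip(*background)]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Hypothetical model over three of the reported predictors:
# age (years), non-paroxysmal AF (0/1), left atrial diameter (mm).
weights = [0.03, 0.9, 0.08]   # assumed coefficients, not from the paper
intercept = -8.0              # assumed intercept, not from the paper
background = [[60, 0, 38], [70, 1, 44], [65, 0, 40], [75, 1, 46]]
patient = [72, 1, 45]

phi = linear_shap(weights, background, patient)
# Base value: the average model output over the background data.
base_value = mean(log_odds(weights, intercept, row) for row in background)

for name, value in zip(["age", "non-paroxysmal AF", "LAD"], phi):
    print(f"{name}: {value:+.3f}")
```

In this sketch the contributions sum, together with the base value, to the patient's log-odds, which is the additivity property that makes SHAP values usable for patient-level explanations.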
Among the 1,078 patients included, the incidence of LAT/SEC was 10.02%. Six independent predictors (age, non-paroxysmal AF, diabetes, ischemic stroke or thromboembolism (IS/TE), hyperuricemia, and left atrial diameter (LAD)) were identified as the most valuable features. The logistic regression model exhibited the best performance in the test set, with an area under the receiver operating characteristic curve (AUC) of 0.850, accuracy of 0.812, sensitivity of 0.818, and specificity of 0.780. SHAP analysis revealed the contribution of explanatory variables to the model and their relationship with LAT/SEC occurrence. The logistic regression model significantly outperformed the CHA2DS2-VASc scoring system, with AUCs of 0.831 and 0.650, respectively (Z = 7.175, P < 0.001).
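For readers less familiar with the reported metrics, the sketch below shows one way accuracy, sensitivity, specificity, and AUC can be computed from labels and predicted risks. The labels, scores, and the 0.5 decision threshold are made-up toy values, not study data, and the function names are illustrative only.

```python
def confusion_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity


def auc(y_true, scores):
    """AUC as the probability that a random positive case scores
    higher than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = LAT/SEC present, 0 = absent (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

acc, sens, spec = confusion_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} "
      f"specificity={spec:.3f} AUC={auc_value:.3f}")
```

Accuracy, sensitivity, and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why AUC is the quantity used when comparing a model against a score such as CHA2DS2-VASc.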
ML proves to be a reliable tool for predicting LAT/SEC risk in NVAF patients. The constructed logistic regression model, along with SHAP interpretation, may serve as a clinically useful tool for identifying high-risk NVAF patients. This enables targeted diagnostic evaluations and the development of personalized treatment strategies based on the findings.