Liu Hong, Lu Lifeng, Xiong Honglin, Fan Chongjun, Fan Lumin, Lin Ziqian, Zhang Hongliu
Business School, University of Shanghai for Science and Technology, Shanghai 200093, China.
Chongming Hospital, Shanghai University of Medicine & Health Sciences, Shanghai 202150, China.
Diagnostics (Basel). 2024 May 30;14(11):1145. doi: 10.3390/diagnostics14111145.
This study aimed to identify risk factors for atrial fibrillation (AF) in the Chongming District of Shanghai. It analyzed data from 678 patients treated at a tertiary hospital in the district from 2020 to 2023, covering season, C-reactive protein, hypertension, platelet count, and other relevant indicators. The researchers proposed a dual feature-selection method that combines hierarchical clustering with Fisher scores (HC-MFS) and benchmarked it against four established methods. Five classification models were trained on a designated dataset; the best-performing model was then used to evaluate each feature-selection method, and the results were validated with test set scores. With that classifier, HC-MFS achieved the highest accuracy (0.9118) and the lowest root mean square error (0.2970) of the methods compared. The authors attribute this gain to the combination and interaction of the two component methods, which improves the quality of the selected feature subset. Seasonal change was the factor most strongly associated with AF, ranking first in terms of correlation (pr = 0.31, FS = 0.11, and DCFS = 0.33). LDL cholesterol, total cholesterol, C-reactive protein, and platelet count, which are associated with inflammatory response and coronary heart disease, were also identified as risk factors that contribute indirectly to AF. The study concludes that machine-learning models can substantially aid clinicians in identifying individuals predisposed to AF, and that AF in the Chongming District is strongly correlated with both pathological and climatic factors, especially seasonal variation.