Rao Congjun, Wang Jing, Liu Ying, Yuan Jing
School of Mathematics and Statistics, Wuhan University of Technology, Wuhan, 430070, People's Republic of China.
Wuhan University of Technology Hospital, Wuhan University of Technology, Wuhan, 430070, People's Republic of China.
Sci Rep. 2025 Sep 2;15(1):32322. doi: 10.1038/s41598-025-18024-8.
Coronary artery disease (CAD) is a major public health concern, and managing it requires accurate identification of risk factors. Existing feature selection methods, however, often suffer from feature preference bias and insufficient multidimensional evaluation, which limits their reliability. To address this, we propose a novel two-stage mechanism that integrates multiple cross-filtering with binary cuckoo search (BCS). In the first stage, features are evaluated from three perspectives, namely relevance (chi-square test), information richness (mutual information), and distance (Fisher score), to eliminate low-importance features and reduce bias. In the second stage, BCS optimizes the feature subset for a random forest classifier, using classification accuracy as the objective function. Empirical analysis on the UCI CAD dataset shows that our method achieves an accuracy of 89%, a precision of 0.87, a recall of 0.91, an F1-score of 0.89, and an AUC of 0.93. These values exceed those of existing homogeneous models (Method II, Method III, Method IV) by at least 3.49%, 3.57%, 3.41%, 3.49%, and 3.33%, respectively. The results highlight superior computational efficiency and predictive performance, making the mechanism a valuable tool for CAD risk assessment.
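The two-stage idea can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify several details, so the following are assumptions made for illustration only: the voting rule in `cross_filter` (keep a feature when at least two of the three criteria rank it in their top half), the Levy-flight step (Mantegna's algorithm), the sigmoid transfer function, the abandonment rate `pa`, and all function and parameter names. The paper's objective is the classification accuracy of a random forest; here `toy_fitness` is a hypothetical stand-in so that the example runs with the standard library alone.

```python
import math
import random


def cross_filter(score_lists, keep_ratio=0.5, min_votes=2):
    """Stage 1 (assumed voting rule): keep a feature when at least
    `min_votes` criteria rank it in their top `keep_ratio` fraction."""
    n = len(score_lists[0])
    top_k = max(1, math.ceil(keep_ratio * n))
    votes = [0] * n
    for scores in score_lists:  # e.g. chi-square, mutual information, Fisher score
        ranked = sorted(range(n), key=lambda j: scores[j], reverse=True)
        for j in ranked[:top_k]:
            votes[j] += 1
    return [j for j in range(n) if votes[j] >= min_votes]


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / max(abs(v), 1e-12) ** (1 / beta)


def binarize(position, rng):
    """Sigmoid transfer: map a continuous position to a 0/1 feature mask."""
    return [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0 for x in position]


def binary_cuckoo_search(fitness, n_features, n_nests=10, n_iter=30,
                         pa=0.25, alpha=1.0, seed=0):
    """Stage 2: maximize `fitness(mask)` over binary feature masks."""
    rng = random.Random(seed)

    def new_nest():
        pos = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        mask = binarize(pos, rng)
        return pos, mask, fitness(mask)

    nests = [new_nest() for _ in range(n_nests)]
    best = max(nests, key=lambda nest: nest[2])
    for _ in range(n_iter):
        for i in range(n_nests):
            # Levy flight from nest i; positions are clipped to keep exp() safe.
            pos = [max(-6.0, min(6.0, x + alpha * levy_step(rng)))
                   for x in nests[i][0]]
            mask = binarize(pos, rng)
            cand = (pos, mask, fitness(mask))
            # Replace a randomly chosen nest if the candidate is better.
            j = rng.randrange(n_nests)
            if cand[2] > nests[j][2]:
                nests[j] = cand
        # Abandon the worst fraction `pa` of nests and rebuild them at random.
        nests.sort(key=lambda nest: nest[2])
        for k in range(int(pa * n_nests)):
            nests[k] = new_nest()
        # Track the best mask seen so far, even if its nest is later replaced.
        best = max(nests + [best], key=lambda nest: nest[2])
    return best[1], best[2]


# Hypothetical stand-in for "random forest accuracy on the selected features":
# reward three pretend-informative features and penalize subset size.
informative = {0, 2, 5}


def toy_fitness(mask):
    return sum(mask[j] for j in informative) - 0.1 * sum(mask)


# Three made-up score lists, one per filter criterion, for four features.
kept = cross_filter([[0.9, 0.1, 0.8, 0.2],
                     [0.7, 0.2, 0.1, 0.9],
                     [0.8, 0.3, 0.6, 0.1]])
print(kept)  # [0, 2]
mask, score = binary_cuckoo_search(toy_fitness, n_features=8)
print(mask, score)
```

In a real pipeline, the score lists would come from the three filter statistics computed on the training data, and `fitness` would train and evaluate a random forest on the columns selected by `mask`, ideally with cross-validation so that the search does not overfit a single split.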