School of Computer Engineering, KIIT University, Patia, Bhubaneswar, Odisha, 751024, India.
Department of Environmental Health, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA.
BMC Bioinformatics. 2024 Oct 15;25(1):329. doi: 10.1186/s12859-024-05866-8.
Stroke prediction remains a critical area of healthcare research, aiming to improve early intervention and patient care strategies. This study investigates the efficacy of machine learning techniques, specifically principal component analysis (PCA) combined with a stacking ensemble, for predicting stroke occurrence from demographic, clinical, and lifestyle factors. We systematically varied the number of PCA components and implemented a stacking model comprising random forest, decision tree, and K-nearest neighbors (KNN) classifiers. Retaining 16 PCA components gave the best predictive performance, with an accuracy of 98.6% in stroke prediction. Evaluation metrics underscored the robustness of the approach in handling class imbalance and improving model performance. Comparative analyses against traditional machine learning algorithms such as SVM, logistic regression, and Naive Bayes showed that the proposed method outperformed them.
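The pipeline described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. The abstract does not specify the meta-learner, the preprocessing, or any hyperparameters, so the logistic-regression final estimator, the standardization step, the 5-fold stacking, and all parameter values below are assumptions. The data are synthetic and imbalanced (generated with `make_classification`), not the stroke dataset used in the study, so the resulting scores say nothing about the reported 98.6% accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_model(n_components=16, seed=0):
    """PCA followed by a stacking ensemble of RF, decision tree, and KNN."""
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("dt", DecisionTreeClassifier(random_state=seed)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    # Assumption: logistic regression as meta-learner (not stated in the abstract).
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    return Pipeline([
        ("scale", StandardScaler()),  # assumption: standardize before PCA
        ("pca", PCA(n_components=n_components)),
        ("stack", stack),
    ])


# Synthetic stand-in data: 20 features, roughly 10% positive class.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_model(n_components=16).fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred, zero_division=0)
print(f"accuracy={accuracy:.3f}  f1(minority)={f1:.3f}")
```

To mimic the component sweep mentioned in the abstract, `build_model` could be called in a loop over several values of `n_components` and the held-out scores compared. On imbalanced data, the minority-class F1 score is usually more informative than accuracy alone.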