Department of Statistics, University of Rajshahi, Rajshahi 6205, Bangladesh.
Department of Statistics, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh 2220, Bangladesh.
Diabetes Metab Syndr. 2021 May-Jun;15(3):877-884. doi: 10.1016/j.dsx.2021.03.035. Epub 2021 Apr 20.
Hypertension has become a major public health issue as its prevalence, and the associated risk of premature death and disability among adults, have increased globally. The main objective of this study is to characterize the risk factors of hypertension among adults in Bangladesh using machine learning (ML) algorithms.
The hypertension data were derived from the Bangladesh Demographic and Health Survey, 2017-18, and included 6965 people aged 35 and above. Two promising risk factor identification methods, the least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVMRFE), were used to detect the critical risk factors of hypertension. In addition, four well-known ML algorithms, namely artificial neural network, decision tree, random forest, and gradient boosting (GB), were used to predict hypertension. The performance of these algorithms was evaluated by accuracy, precision, recall, F-measure, and area under the curve (AUC).
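The workflow described above (select features with LASSO or SVMRFE, fit a classifier, then score it) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic (`make_classification` stands in for the survey data), the binary outcome is regressed directly in the LASSO step (an L1-penalized logistic regression would be a common alternative), and the hyperparameters and helper names (`top_k_lasso`, `top_k_svmrfe`, `evaluate_gb`) are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def top_k_lasso(X, y, k=5, alpha=0.01):
    """Rank features by absolute LASSO coefficient and keep the top k."""
    model = Lasso(alpha=alpha).fit(X, y)  # assumption: 0/1 outcome regressed directly
    return np.sort(np.argsort(-np.abs(model.coef_))[:k])


def top_k_svmrfe(X, y, k=5):
    """Recursively drop the weakest feature of a linear SVM until k remain."""
    selector = RFE(SVC(kernel="linear"), n_features_to_select=k, step=1).fit(X, y)
    return np.flatnonzero(selector.support_)


def evaluate_gb(X_train, y_train, X_test, y_test, columns):
    """Fit gradient boosting on the selected columns and score the held-out set."""
    clf = GradientBoostingClassifier(random_state=0)
    clf.fit(X_train[:, columns], y_train)
    pred = clf.predict(X_test[:, columns])
    prob = clf.predict_proba(X_test[:, columns])[:, 1]
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f_measure": f1_score(y_test, pred),
        "auc": roc_auc_score(y_test, prob),
    }


# Synthetic stand-in for the survey data (assumption: 12 candidate predictors).
X, y = make_classification(n_samples=600, n_features=12, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardize so that coefficient magnitudes are comparable across features.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

selected = {
    "LASSO-GB": top_k_lasso(X_train, y_train),
    "SVMRFE-GB": top_k_svmrfe(X_train, y_train),
}
results = {name: evaluate_gb(X_train, y_train, X_test, y_test, cols)
           for name, cols in selected.items()}

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

The other three classifiers named in the abstract could be swapped in for `GradientBoostingClassifier` inside `evaluate_gb` to compare all selector-classifier combinations on the same split.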
The five most important risk factors for hypertension were age, BMI, wealth index, working status, and marital status according to LASSO, and age, BMI, marital status, diabetes, and region according to SVMRFE. The SVMRFE-GB combination gave the highest accuracy (66.98%), recall (97.92%), F-measure (78.99%), and AUC (0.669) among the combinations compared.
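Precision is listed among the evaluation metrics, but its value for SVMRFE-GB is not reported here. Assuming the F-measure is the usual F1 score (the harmonic mean of precision and recall), an approximate implied precision can be back-calculated from the reported recall and F-measure. The function name below is hypothetical, and the result is only as precise as the rounded figures it starts from.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for P, giving P = F1*R / (2*R - F1)."""
    return f1 * recall / (2 * recall - f1)


# Reported SVMRFE-GB values: F-measure 78.99%, recall 97.92%.
precision = implied_precision(0.7899, 0.9792)
print(f"implied precision: {precision:.2%}")
```

Under this assumption the implied precision is about 66%, close to the reported accuracy. Combined with a recall near 98%, this suggests that the model labels most respondents as hypertensive.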
The GB-based algorithm performed best for predicting hypertension at an early stage in Bangladesh. This study therefore suggests that policymakers could use the SVMRFE-GB combination to inform decisions on hypertension control, saving time and reducing cost for Bangladeshi adults.