Al-Mamun Firoj, Mamun Mohammed A, Hasan Md Emran, ALmerab Moneerah Mohammad, Islam Johurul, Muhit Mohammad
CHINTA Research Bangladesh, Savar, Dhaka, Bangladesh.
Department of Public Health & Informatics, Jahangirnagar University, Savar, Dhaka, Bangladesh.
BJPsych Open. 2025 May 12;11(3):e96. doi: 10.1192/bjo.2025.47.
Mental health conditions, particularly depression and anxiety, are highly prevalent and impose substantial health burdens globally. Despite advancements in machine learning, there is limited application of these methods in predicting common mental illnesses within community populations in low-resource settings.
This study aims to estimate the combined prevalence of common mental illnesses (depression and anxiety) in a rural Bangladeshi community and to identify their associated risk factors using machine learning models.
This cross-sectional study surveyed 490 adults aged 18-59 years in a rural Bangladeshi community. Depression and anxiety were assessed using the two-item Patient Health Questionnaire (PHQ-2) and the two-item Generalised Anxiety Disorder (GAD-2) scale. Four machine learning models (Categorical Boosting, support vector machine, random forest and XGBoost (eXtreme Gradient Boosting)) were trained on 80% of the dataset and tested on the remaining 20%. Performance was evaluated using accuracy, precision, F1 score, log-loss and area under the receiver operating characteristic curve (AUC-ROC).
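The evaluation protocol described above (an 80/20 split followed by accuracy, precision, F1 score, log-loss and AUC-ROC) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the function names `train_test_split` and `classification_metrics`, the 0.5 decision threshold, the fixed random seed and the positive-class (rather than class-averaged) definitions of precision and F1 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Binary metrics from true labels (0/1) and predicted probabilities of class 1."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Log-loss: mean negative log-likelihood, with probabilities clipped
    # away from 0 and 1 so the logarithm stays finite.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    # AUC-ROC as the probability that a randomly chosen positive case is
    # scored above a randomly chosen negative case (ties count as half).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if pos and neg:
        wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
                   for a in pos for b in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"accuracy": accuracy, "precision": precision, "f1": f1,
            "log_loss": log_loss, "auc_roc": auc}


# With 490 participants, an 80/20 split leaves 392 for training and 98 for testing.
train_ids, test_ids = train_test_split(list(range(490)))
print(len(train_ids), len(test_ids))
```

In practice, the predicted probabilities passed to `classification_metrics` would come from each fitted model applied to the held-out 20%, so that all models are compared on the same test participants.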
Some 20.4% of participants experienced at least one common mental illness. Feature importance analysis identified house type, age group and educational status as the most significant predictors. SHAP (Shapley Additive exPlanations) values highlighted their influence on model outputs, and the XGBoost gain metric confirmed the importance of marital status and house type, with gains of 0.76 and 0.73, respectively. XGBoost delivered the best performance, achieving an F1 score of 71.01%, precision of 71.58%, accuracy of 71.15% and the lowest log-loss value of 0.56. The random forest had an accuracy of 78.21% and an AUC-ROC of 0.90.
The findings suggest that targeted interventions addressing housing and other social determinants could improve mental health outcomes in similar rural settings. Further studies should use longitudinal data to explore causal relationships.