Feature group partitioning: an approach for depression severity prediction with class balancing using machine learning algorithms.

Affiliations

Department of Computer Science and Engineering, Dhaka University of Engineering & Technology, Gazipur, 1707, Bangladesh.

Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science & Technology University, Gopalganj, 8100, Bangladesh.

Publication information

BMC Med Res Methodol. 2024 Jun 3;24(1):123. doi: 10.1186/s12874-024-02249-8.

DOI:10.1186/s12874-024-02249-8

PMID:38831346

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11145774/

Abstract

In contemporary society, depression has emerged as a prominent mental disorder that exhibits exponential growth and exerts a substantial influence on premature mortality. Although numerous studies have applied machine learning methods to predict signs of depression, only a limited number have treated the severity level as a multiclass variable. Moreover, data are rarely distributed equally among all classes in real-world populations, so the resulting class imbalance across multiple classes is a substantial challenge in this domain. This research therefore emphasizes the importance of addressing class imbalance in the multiclass setting. We introduce a new approach, Feature group partitioning (FGP), in the data preprocessing phase, which effectively reduces the dimensionality of the features to a minimum. For class balancing, this study used synthetic oversampling techniques, specifically the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic (ADASYN) sampling. The dataset was collected from university students by administering the Burn Depression Checklist (BDC). On the modeling side, we implemented a heterogeneous stacking ensemble, a homogeneous bagging ensemble, and five distinct supervised machine learning algorithms. Overfitting was checked by comparing accuracy across the training, validation, and testing datasets. To assess the effectiveness of the prediction models, we used balanced accuracy, sensitivity, specificity, precision, and F1-score. Overall, a comprehensive analysis demonstrates the difference between the Conventional Depression Screening (CDS) and FGP approaches. The results show that the stacking classifier with FGP and SMOTE yields the highest balanced accuracy, 92.81%. The empirical evidence demonstrates that the FGP approach, when combined with SMOTE, produces better performance in predicting the severity of depression. Most importantly, the optimized training time of the FGP approach for all of the classifiers is a significant achievement of this research.
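To make the class-balancing step concrete, the following is a minimal sketch of SMOTE-style oversampling in plain Python. It is an illustration of the general idea only (each synthetic point is an interpolation between a minority sample and one of its nearest same-class neighbours), not the authors' implementation. The function name `smote_oversample`, the parameter `k`, the toy feature vectors, and the severity labels are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=3, seed=0):
    """Grow every class to the majority-class size by interpolation.

    Illustrative SMOTE-style sketch (hypothetical helper, not the paper's code).
    X is a list of numeric feature vectors, y the matching class labels.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class
    X_out, y_out = [list(row) for row in X], list(y)

    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # interpolation needs at least two samples of the class
        for _ in range(target - n):
            base = rng.choice(members)
            # up to k nearest same-class neighbours of the base point
            others = [m for m in members if m is not base]
            neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
            mate = rng.choice(neighbours)
            gap = rng.random()  # position on the segment from base to mate
            X_out.append([b + gap * (m - b) for b, m in zip(base, mate)])
            y_out.append(label)
    return X_out, y_out


# Toy data with made-up severity labels: 4 / 2 / 2 samples per class.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],
     [5.0, 5.0], [5.2, 5.1],
     [9.0, 1.0], [9.1, 1.2]]
y = ["mild"] * 4 + ["moderate"] * 2 + ["severe"] * 2

X_bal, y_bal = smote_oversample(X, y)
print(Counter(y_bal))
```

Because synthetic points lie on segments between existing samples of the same class, the minority classes are filled in rather than duplicated. In practice, oversampling is normally applied to the training split only, so that synthetic points do not leak into validation or test data.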
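The evaluation metrics named in the abstract can be illustrated with a short sketch. It assumes the common multiclass convention in which sensitivity and specificity are computed one-vs-rest for each class and balanced accuracy is the unweighted mean of per-class sensitivity; the abstract does not state which definition the authors used. The helper names and the toy labels are hypothetical.

```python
def per_class_rates(y_true, y_pred, labels):
    """One-vs-rest sensitivity and specificity for each class."""
    rates = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)
        rates[label] = {
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
        }
    return rates


def balanced_accuracy(y_true, y_pred, labels):
    """Unweighted mean of per-class sensitivity, so each class counts equally."""
    rates = per_class_rates(y_true, y_pred, labels)
    return sum(r["sensitivity"] for r in rates.values()) / len(labels)


# Toy imbalanced case: a model that always predicts the majority class.
labels = ["none", "severe"]
y_true = ["none"] * 8 + ["severe"] * 2
y_pred = ["none"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred, labels)
rates = per_class_rates(y_true, y_pred, labels)
print(accuracy, balanced, rates)
```

In this toy case plain accuracy looks acceptable because the majority class dominates, while balanced accuracy exposes that the minority class is never detected. This is the reason balanced accuracy is a more informative headline metric than plain accuracy on imbalanced severity classes.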

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/46b7/11145774/8ca70dfb329b/12874_2024_2249_Fig1_HTML.jpg
