Baydili İsmail, Tasci Burak, Tasci Gülay
Department of Audiovisual Techniques and Media Production, Vocational School of Technical Sciences, Fırat University, Elazig 23119, Turkey.
Vocational School of Technical Sciences, Fırat University, Elazig 23119, Turkey.
Behav Sci (Basel). 2025 Mar 12;15(3):352. doi: 10.3390/bs15030352.
Social media has become an essential platform for understanding human behavior, particularly in relation to mental health conditions such as depression and suicidal tendencies. Given the increasing reliance on digital communication, the ability to automatically detect individuals at risk through their social media activity holds significant potential for early intervention and mental health support. This study proposes a machine learning-based framework that integrates pre-trained language models and advanced feature selection techniques to improve the detection of depression and suicidal tendencies from social media data. We utilize six diverse datasets, collected from platforms such as Twitter and Reddit, ensuring a broad evaluation of model robustness. The proposed methodology incorporates Cumulative Weight-based Iterative Neighborhood Component Analysis (CWINCA) for feature selection and Support Vector Machines (SVMs) for classification. The results indicate that the model achieves high accuracy across multiple datasets, ranging from 80.74% to 99.96%, demonstrating its effectiveness in identifying risk factors associated with mental health issues. These findings highlight the potential of social media-based automated detection methods as complementary tools for mental health professionals. Future work will focus on real-time detection capabilities and multilingual adaptation to enhance the practical applicability of the proposed approach.
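The abstract names CWINCA but does not describe its steps, so the following is only a minimal sketch of the general idea suggested by the name, not the authors' implementation. It assumes that each feature already has a non-negative importance weight (in the study these would presumably come from Neighborhood Component Analysis), that two cumulative-weight fractions bound the range of candidate subset sizes, and that each candidate subset is scored by some classifier-based criterion (for example, SVM validation accuracy). The function names, the `low` and `high` fractions, and the `score` callback are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def rank_features(weights: Sequence[float]) -> List[int]:
    """Return feature indices sorted by descending weight (ties keep input order)."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


def cumulative_cutoff(weights: Sequence[float], fraction: float) -> int:
    """Smallest number of top-ranked features whose weights sum to
    at least `fraction` of the total weight."""
    total = sum(weights)
    running = 0.0
    for count, idx in enumerate(rank_features(weights), start=1):
        running += weights[idx]
        if running >= fraction * total:
            return count
    return len(weights)


def select_features(
    weights: Sequence[float],
    score: Callable[[List[int]], float],
    low: float = 0.5,   # assumed lower cumulative-weight fraction
    high: float = 0.9,  # assumed upper cumulative-weight fraction
) -> Tuple[List[int], float]:
    """Try every top-k subset with k between the two cumulative-weight
    cutoffs and keep the one with the highest score."""
    ranked = rank_features(weights)
    start = cumulative_cutoff(weights, low)
    stop = cumulative_cutoff(weights, high)
    best_subset: List[int] = []
    best_score = float("-inf")
    for k in range(start, stop + 1):
        subset = ranked[:k]
        current = score(subset)  # placeholder for a classifier-based score
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


if __name__ == "__main__":
    # Toy weights and a toy score that happens to prefer two features.
    toy_weights = [5, 1, 3, 1]
    chosen, value = select_features(toy_weights, lambda s: -abs(len(s) - 2))
    print(chosen, value)
```

In a real pipeline the `score` callback would train and validate a classifier on the selected columns of the embedding matrix; the toy lambda here only stands in for that step.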