Faculty of Computing and Informatics, Multimedia University, Persiaran Multimedia, Cyberjaya 63100, Malaysia.
Int J Environ Res Public Health. 2022 Aug 19;19(16):10347. doi: 10.3390/ijerph191610347.
Suicide is a major public-health problem that exists in virtually every part of the world. Hundreds of thousands of people die by suicide every year. The early detection of suicidal ideation is critical for suicide prevention. However, there are challenges associated with conventional suicide-risk screening methods. At the same time, individuals contemplating suicide are increasingly turning to social media and online forums, such as Reddit, to express their feelings and share their struggles with suicidal thoughts. This has prompted research that applies machine learning and natural language processing techniques to detect suicidality among social media and forum users. The objective of this paper is to investigate the methods employed to detect suicidal ideation on the Reddit forum. To achieve this objective, we conducted a literature review of recent articles detailing machine learning and natural language processing techniques applied to Reddit data to detect the presence of suicidal ideation. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we selected 26 studies published between 2018 and 2022. The findings of the review outline the prevalent methods of data collection, data annotation, data preprocessing, feature engineering, model development, and evaluation. Furthermore, we present several Reddit-based datasets used to construct suicidal ideation detection models. Finally, we conclude by discussing the current limitations of, and future directions for, research on suicidal ideation detection.