Sawhney Ramit, Joshi Harshit, Nobles Alicia, Shah Rajiv Ratn
Netaji Subhas Institute of Technology.
University of Delhi.
Proc Int AAAI Conf Weblogs Soc Media. 2021 Jun 4;15:609-620. Epub 2021 May 22.
Social media platforms already leverage existing online socio-technical systems to deliver just-in-time suicide prevention interventions to the public. These efforts primarily rely on user reports of potential self-harm content, which are then reviewed by moderators. Most recently, platforms have employed automated models to identify self-harm content, but acknowledge that these automated models still struggle to understand the nuance of human language (e.g., sarcasm). By explicitly focusing on Twitter posts that could easily be misidentified by a model as expressing suicidal intent (i.e., they contain similar phrases such as "wanting to die"), our work examines the temporal differences in historical expressions of general and emotional language prior to a clear expression of suicidal intent. Additionally, we analyze time-aware neural models that build on these language variants and factor in the historical, emotional spectrum of a user's tweeting activity. The strongest model achieves high (statistically significant) performance (macro F1=0.804, recall=0.813) in identifying social media posts indicative of suicidal intent. Using three use cases of tweets with phrases common to suicidal intent, we qualitatively analyze and interpret how such models decide whether suicidal intent is present and discuss how these analyses may be used to alleviate the burden on human moderators within the known constraints of how moderation is performed (e.g., no access to the user's timeline). Finally, we discuss the ethical implications of such data-driven models and inferences about suicidal intent from social media.