Grinnell College, Grinnell, IA, USA.
University of Michigan, Ann Arbor, MI, USA.
BMC Bioinformatics. 2018 Jun 13;19(Suppl 8):211. doi: 10.1186/s12859-018-2197-z.
Suicide is an alarming public health problem accounting for a considerable number of deaths each year worldwide. Many more individuals contemplate suicide. Understanding the attributes, characteristics, and exposures correlated with suicide remains an urgent and significant problem. As social networking sites have become more common, users have adopted these sites to talk about intensely personal topics, among them their thoughts about suicide. Such data has previously been evaluated by analyzing the language features of social media posts and using factors derived by domain experts to identify at-risk users.
In this work, we automatically extract informal latent recurring topics of suicidal ideation found in social media posts. Our evaluation demonstrates that we are able to automatically reproduce many of the expertly determined risk factors for suicide. Moreover, we identify many informal latent topics related to suicidal ideation, such as concerns over health, work, self-image, and financial issues.
These informal topics can be more specific or more general than the expert-defined risk factors. Some of our topics express meaningful ideas not contained in the risk factors, and some risk factors do not have complementary latent topics. In short, our analysis of the latent topics extracted from social media posts containing suicidal ideation suggests that users of these systems express ideas that are complementary to the topics defined by experts but differ in their scope, focus, and precision of language.