Department of Psychology, University of Notre Dame, Notre Dame, Indiana.
Suicide Life Threat Behav. 2021 Feb;51(1):55-64. doi: 10.1111/sltb.12668.
Text-based responses may contribute substantially to suicide risk prediction, yet research that includes text data is limited. This may be due to a lack of exposure to, and familiarity with, statistical analyses for this data structure.
The current study provides an overview of data processing and statistical algorithms for text data, guided by an empirical example of 947 online participants who completed both open-ended items and traditional self-report measures. We give an introduction to a number of text-based statistical approaches, including dictionary-based methods, topic modeling, word embeddings, and deep learning.
We analyze responses to the open-ended question "How do you feel today?", describing characteristics of the responses and using them to predict past-year suicidal ideation.
We see the analysis of text from social media, open-ended questions, and other text sources (e.g., medical records) as an important form of assessment that complements traditional scales. It can shed light on what our current set of questionnaires is missing, which may ultimately improve both our understanding and prediction of suicide.
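
As an illustration of the simplest approach named above, a dictionary-based method can be sketched as scoring each response by the share of its words that fall into predefined categories. The sketch below is a minimal example under stated assumptions, not the study's procedure: the `LEXICON` word lists, the category names, and the function names are hypothetical, and applied work would normally use a validated dictionary.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; not taken from the study.
LEXICON = {
    "negative_emotion": {"sad", "tired", "hopeless", "anxious", "lonely"},
    "positive_emotion": {"happy", "good", "great", "calm", "hopeful"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_scores(text):
    """Return, per category, the proportion of tokens found in the lexicon."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens)
    scores = {}
    for category, words in LEXICON.items():
        hits = sum(counts[word] for word in words)
        # Guard against empty responses to avoid division by zero.
        scores[category] = hits / total if total else 0.0
    return scores


if __name__ == "__main__":
    print(dictionary_scores("I feel sad and tired today"))
```

Category proportions of this kind could then be used as predictors alongside traditional self-report scores, for example in a regression model of past-year suicidal ideation.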