School of Data Science, City University of Hong Kong, Hong Kong Special Administrative Region; Hong Kong Jockey Club Centre for Suicide Research and Prevention, The University of Hong Kong, Hong Kong Special Administrative Region.
Hong Kong Jockey Club Centre for Suicide Research and Prevention, The University of Hong Kong, Hong Kong Special Administrative Region.
Soc Sci Med. 2021 Aug;283:114176. doi: 10.1016/j.socscimed.2021.114176. Epub 2021 Jun 25.
Detecting users at risk of suicide in text-based counseling services is essential to ensure that at-risk individuals are flagged and prioritized.
The objective of this study is to develop a domain knowledge-aware risk assessment (KARA) model to improve suicide risk detection in online counseling systems.
We obtained the largest known de-identified dataset from an emotional support system established in Hong Kong, comprising 5682 Cantonese conversations between help-seekers and counselors. Of those, 682 conversations disclosed crisis intentions of suicide. We constructed a suicide-knowledge graph, representing suicide-related domain knowledge as a computer-processable graph. This knowledge graph was embedded into a deep learning model to improve its ability to identify help-seekers in crisis. As the baseline, a standard NLP model was applied to the same task. A random 80% of the study samples was used to train model parameters, and the remaining 20% was used for model validation. Precision, recall, and the c-statistic were reported as evaluation metrics.
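The evaluation protocol described above (a random 80/20 split, then per-class precision and recall plus the c-statistic) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_80_20`, `precision_recall`, and `c_statistic` are hypothetical, the 0.5 decision threshold is an assumption, and the labels and scores are toy values rather than study data. The c-statistic is computed here as the fraction of (crisis, non-crisis) pairs in which the crisis case receives the higher score, with ties counted as one half.

```python
import random


def split_80_20(n_items, train_frac=0.8, seed=0):
    """Randomly partition item indices into training and validation sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    cut = int(train_frac * n_items)
    return indices[:cut], indices[cut:]


def precision_recall(y_true, y_pred, positive):
    """Precision and recall, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def c_statistic(y_true, scores):
    """Concordance: P(score of a crisis case > score of a non-crisis case)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return concordant / (len(pos) * len(neg))


# Toy example: 1 = crisis, 0 = non-crisis (illustrative values only).
y_true = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.3, 0.6, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

train_idx, valid_idx = split_80_20(5682)
crisis_precision, crisis_recall = precision_recall(y_true, y_pred, positive=1)
noncrisis_precision, noncrisis_recall = precision_recall(y_true, y_pred, positive=0)
auc = c_statistic(y_true, scores)
```

Reporting precision and recall separately for each class matters with imbalanced data such as this (682 crisis conversations out of 5682), because strong performance on the majority non-crisis class can mask missed crisis cases.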
Both KARA and the baseline achieved high precision (0.984 and 0.951, respectively; see Table 2) and high recall (0.942 and 0.947) for non-crisis cases. For crisis cases, however, the KARA model achieved a much higher recall than the baseline (0.870 vs 0.791). The c-statistics of KARA and the baseline were 0.815 and 0.760, respectively.
KARA significantly outperformed standard NLP models, demonstrating good translational value and clinical relevance.