Lee Kyung Sang, Lee Hyewon, Myung Woojae, Song Gil-Young, Lee Kihwang, Kim Ho, Carroll Bernard J, Kim Doh Kwan
Department of Psychiatry, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
Department of Public Health Sciences, Graduate School of Public Health, Seoul National University, Seoul, Republic of Korea.
Psychiatry Investig. 2018 Apr;15(4):344-354. doi: 10.30773/pi.2017.10.15. Epub 2018 Apr 5.
Suicide is a significant public health concern worldwide. Social media data may help identify individuals at high risk of suicide and predict suicide rates at the population level. In this study, we report an advanced daily suicide prediction model that combines social media data with economic and meteorological variables and with observed suicide data lagged by 1 week.
The social media data were drawn from weblog posts. We examined a total of 10,035 social media keywords for suicide prediction. For each day over a 2-year period, we predicted the national suicide number 7 days in advance, using a 5-year modeling period that moved forward daily.
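The moving-window design can be illustrated with a minimal sketch. It only enumerates the date ranges involved and does not reproduce the authors' model. The function name `rolling_windows`, the start date, and the approximation of 5 years as 5 x 365 days are assumptions made for illustration.

```python
from datetime import date, timedelta

def rolling_windows(first_target, n_targets, horizon_days=7, window_years=5):
    """Yield (train_start, train_end, target) for each daily forecast.

    Hypothetical helper: the window is re-anchored every day so that it
    always ends horizon_days before the date being predicted.
    """
    for i in range(n_targets):
        target = first_target + timedelta(days=i)
        # Last day whose data are available when the forecast is issued.
        train_end = target - timedelta(days=horizon_days)
        # Assumption: a 5-year window is taken as 5 * 365 days (leap days ignored).
        train_start = train_end - timedelta(days=365 * window_years - 1)
        yield train_start, train_end, target

# Illustrative start date only; the abstract does not state the study dates.
windows = list(rolling_windows(date(2012, 1, 1), n_targets=3))
for train_start, train_end, target in windows:
    print(train_start, train_end, target)
```

Each successive window shifts forward by one day, so every forecast uses only data that would have been available 7 days before its target date.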
Our model predicted the likely range of daily national suicide numbers with 82.9% accuracy. Among the social media variables, words denoting economic issues and mood status showed high predictive strength. Alongside the social media variables, the notable predictors were the observed number of suicides one week earlier, recent celebrity suicide, and day of the week, followed by stock index, consumer price index, and sunlight duration 7 days before the target date.
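The abstract does not define how range accuracy was computed. One plausible reading, sketched here as an assumption, is the share of days on which the observed count fell inside the predicted range. The function `range_accuracy` and all numbers in the example are hypothetical and are not study data.

```python
def range_accuracy(observed, lower, upper):
    """Fraction of days whose observed count lies within [lower, upper].

    Assumed definition for illustration; the source does not specify it.
    """
    hits = sum(lo <= y <= hi for y, lo, hi in zip(observed, lower, upper))
    return hits / len(observed)

# Made-up daily counts and predicted ranges, for illustration only.
observed = [40, 35, 52, 41]
lower = [30, 36, 45, 38]
upper = [45, 44, 55, 40]

acc = range_accuracy(observed, lower, upper)
print(acc)  # 2 of the 4 days fall inside their range
```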
These results strengthen the case for social media data to supplement classical social/economic/climatic data in forecasting national suicide events.