Understanding Depressive Symptoms and Psychosocial Stressors on Twitter: A Corpus-Based Study.

Authors

Mowery Danielle, Smith Hilary, Cheney Tyler, Stoddard Greg, Coppersmith Glen, Bryan Craig, Conway Mike

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

Department of Psychology, University of Utah, Salt Lake City, UT, United States.

Publication information

J Med Internet Res. 2017 Feb 28;19(2):e48. doi: 10.2196/jmir.6895.

Abstract

BACKGROUND

With a lifetime prevalence of 16.2%, major depressive disorder is the fifth biggest contributor to the disease burden in the United States.

OBJECTIVE

The aim of this study, building on previous work qualitatively analyzing depression-related Twitter data, was to describe the development of a comprehensive annotation scheme (ie, coding scheme) for manually annotating Twitter data with Diagnostic and Statistical Manual of Mental Disorders, Edition 5 (DSM-5) major depressive symptoms (eg, depressed mood, weight change, psychomotor agitation or retardation) and Diagnostic and Statistical Manual of Mental Disorders, Edition IV (DSM-IV) psychosocial stressors (eg, educational problems, problems with primary support group, housing problems).

METHODS

Using this annotation scheme, we developed an annotated corpus, the Depressive Symptom and Psychosocial Stressors Acquired Depression (SAD) corpus, consisting of 9300 tweets randomly sampled from the Twitter application programming interface (API) using depression-related keywords (eg, depressed, gloomy, grief). An analysis of our annotated corpus yielded several key results.

RESULTS

First, 72.09% (6829/9473) of tweets containing relevant keywords were nonindicative of depressive symptoms (eg, "we're in for a new economic depression"). Second, the most prevalent symptoms in our dataset were depressed mood and fatigue or loss of energy. Third, less than 2% of tweets contained more than one depression-related category (eg, diminished ability to think or concentrate, depressed mood). Finally, we found very high positive correlations between some depression-related symptoms in our annotated dataset (eg, fatigue or loss of energy and educational problems; educational problems and diminished ability to think).
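As a quick arithmetic check, the first reported proportion can be recomputed from the two counts quoted above. The snippet is only a sketch: the helper name `percent` and the variable names are our own, not part of the study.

```python
def percent(numerator: int, denominator: int, ndigits: int = 2) -> float:
    """Return numerator/denominator as a percentage rounded to ndigits."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100.0 * numerator / denominator, ndigits)


# Counts taken from the Results paragraph of the abstract.
non_indicative = 6829   # keyword-matched tweets not indicative of depressive symptoms
keyword_tweets = 9473   # all keyword-matched tweets in the reported ratio

share = percent(non_indicative, keyword_tweets)
print(f"{share:.2f}% of keyword-matched tweets were nonindicative")
```

The recomputed value agrees with the 72.09% stated in the abstract.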

CONCLUSIONS

We successfully developed an annotation scheme and an annotated corpus, the SAD corpus, consisting of 9300 tweets randomly selected from the Twitter application programming interface using depression-related keywords. Our analyses suggest that keyword queries alone might not be suitable for public health monitoring because context can change the meaning of a keyword in a statement. However, postprocessing approaches could be useful for reducing the noise and improving the signal needed to detect depression symptoms using social media.
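The limitation of keyword queries can be illustrated with a minimal sketch. This is not the authors' pipeline: the keyword list is illustrative (the first three terms appear in the abstract; "depression" is added here as an assumption so that the quoted example matches), the function `matches_keyword` is hypothetical, and two of the three sample texts are invented for the illustration.

```python
import re

# Illustrative keyword list. "depressed", "gloomy", and "grief" come from the
# abstract; "depression" is an assumption added so the quoted example matches.
KEYWORDS = ("depressed", "gloomy", "grief", "depression")

# Whole-word, case-insensitive match on any keyword.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def matches_keyword(text: str) -> bool:
    """Return True if the text contains any keyword as a whole word."""
    return _PATTERN.search(text) is not None


tweets = [
    "we're in for a new economic depression",  # quoted in the abstract as nonindicative
    "feeling so depressed and tired today",    # hypothetical example
    "great weather this morning",              # hypothetical example
]

# A keyword query flags the first two texts alike, although only the second
# one talks about a person's mood.
flagged = [t for t in tweets if matches_keyword(t)]
for t in flagged:
    print(t)
```

Because the matcher sees only surface forms, the economic sense of "depression" is flagged in the same way as a statement about mood, which is the kind of noise a later annotation or postprocessing step would have to filter out.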

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ca50/5350450/33310ccad06a/jmir_v19i2e48_fig1.jpg
