Division of Violence Prevention, National Center for Injury Prevention and Control, U.S. Centers for Disease Control and Prevention (CDC), Atlanta, GA, USA.
Office of Strategy and Innovation, National Center for Injury Prevention and Control, U.S. Centers for Disease Control and Prevention (CDC), Atlanta, GA, USA.
J Ment Health. 2020 Apr;29(2):234-241. doi: 10.1080/09638237.2020.1739251. Epub 2020 Mar 30.
Upstream public health indicators of poor mental health in the United States (U.S.) are currently measured by national telephone-based surveys; however, results are delayed by 1-2 years, limiting real-time assessment of trends. The aim of this study was to evaluate associations between conversational topics on Twitter from 2018 to 2019 and mental distress rates from 2017 to 2018 for the 50 U.S. states and the District of Columbia. We used a novel lexicon, Empath, to identify the conversational topics in aggregate Twitter messages that correlated most strongly with official U.S. state-level rates of mental distress from the Behavioral Risk Factor Surveillance System. The ten lexical categories most positively correlated with state-level rates of frequent mental distress included categories about death, illness, or injury. The lexical categories most inversely correlated with mental distress included categories that serve as proxies for economic prosperity and industry. A linear regression model that used the prevalence of the 10 most positively and 10 most negatively correlated lexical categories to predict state-level rates of mental distress on an independent sample of data yielded estimates that were moderately similar to actual rates (mean difference = 0.52%; Pearson correlation = 0.45, p < 0.001). This work informs efforts to use social media to measure population-level trends in mental health.
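The analysis described in the abstract (rank lexical categories by their correlation with state-level distress rates, keep the 10 most positive and 10 most negative, then fit a linear regression and check it on independent data) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the number of categories (200) is a placeholder, the helper names are hypothetical, and reading "mean difference" as a mean absolute difference is an assumption.

```python
import numpy as np

def pearson_by_column(X, y):
    """Pearson correlation between each column of X and the vector y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc * yc[:, None]).sum(axis=0) / denom

def select_categories(X, y, k=10):
    """Indices of the k most positively and k most negatively correlated columns."""
    order = np.argsort(pearson_by_column(X, y))  # ascending by correlation
    return np.concatenate([order[-k:], order[:k]])

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

# Synthetic stand-in data (assumption): 51 jurisdictions, 200 category prevalences.
rng = np.random.default_rng(0)
n_states, n_categories = 51, 200
true_w = np.zeros(n_categories)
true_w[:5], true_w[5:10] = 2.0, -2.0  # a few categories drive the outcome

X_fit = rng.random((n_states, n_categories))
y_fit = 12 + X_fit @ true_w + rng.normal(0, 0.5, n_states)
X_new = rng.random((n_states, n_categories))  # independent sample
y_new = 12 + X_new @ true_w + rng.normal(0, 0.5, n_states)

cols = select_categories(X_fit, y_fit, k=10)
coef = fit_linear(X_fit[:, cols], y_fit)
y_hat = predict(coef, X_new[:, cols])

mean_abs_diff = float(np.mean(np.abs(y_hat - y_new)))
r = float(np.corrcoef(y_hat, y_new)[0, 1])
print(f"mean absolute difference = {mean_abs_diff:.2f}, Pearson r = {r:.2f}")
```

Selecting categories and fitting coefficients on one sample, then scoring on a separate one, mirrors the abstract's use of an independent sample; with 20 predictors and only 51 observations, in-sample agreement alone would overstate how well the model tracks actual rates.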