Center on Continuum of Care in Addictions, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
Positive Psychology Center, Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
PLoS One. 2018 Apr 4;13(4):e0194290. doi: 10.1371/journal.pone.0194290. eCollection 2018.
The current study analyzes a large set of Twitter data from 1,384 US counties to determine whether rates of excessive alcohol consumption can be predicted from the words posted in each county.
Data from over 138 million county-level tweets were analyzed using predictive modeling, differential language analysis, and mediating language analysis.
Twitter language data captures cross-sectional patterns of excessive alcohol consumption beyond those captured by sociodemographic factors (e.g., age, gender, race, income, education), and can be used to accurately predict rates of excessive alcohol consumption. Additionally, mediation analysis found that Twitter topics (e.g., 'ready gettin leave') can explain much of the association between socioeconomic status and excessive alcohol consumption.
Twitter data can be used to predict public health concerns such as excessive drinking. Combining mediation analysis with predictive modeling allows a large portion of the variance associated with socioeconomic status to be explained.
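The idea behind the mediation result can be illustrated with a minimal sketch. It is not the authors' mediating language analysis, whose details are not given here; it is a generic product-of-coefficients mediation on synthetic data, assuming a single socioeconomic index, a single topic-usage score, and a county drinking rate. The function `mediation`, the variable names `ses`, `topic`, and `drinking`, and all coefficients used to generate the data are hypothetical.

```python
import random


def _centered(values):
    """Return the values with their mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(u, v):
    """Sum of element-wise products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def mediation(x, m, y):
    """Product-of-coefficients mediation with ordinary least squares.

    x: predictor (hypothetical socioeconomic index per county)
    m: mediator (hypothetical topic-usage score per county)
    y: outcome (hypothetical excessive-drinking rate per county)
    """
    xc, mc, yc = _centered(x), _centered(m), _centered(y)
    sxx, smm, sxm = _dot(xc, xc), _dot(mc, mc), _dot(xc, mc)
    sxy, smy = _dot(xc, yc), _dot(mc, yc)

    total = sxy / sxx                 # slope of y ~ x (total effect)
    a = sxm / sxx                     # slope of m ~ x
    det = sxx * smm - sxm ** 2        # determinant for y ~ x + m
    direct = (smm * sxy - sxm * smy) / det  # coefficient on x, given m
    b = (sxx * smy - sxm * sxy) / det       # coefficient on m, given x
    indirect = a * b                  # effect of x carried through m

    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }


# Synthetic county-level data; every coefficient below is an assumption
# chosen for illustration, not an estimate from the study.
rng = random.Random(0)
n_counties = 500
ses = [rng.gauss(0, 1) for _ in range(n_counties)]
topic = [0.8 * s + rng.gauss(0, 0.3) for s in ses]
drinking = [0.3 * s + 0.5 * t + rng.gauss(0, 0.3) for s, t in zip(ses, topic)]

result = mediation(ses, topic, drinking)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With ordinary least squares, the total effect equals the direct effect plus the indirect effect, so `proportion_mediated` reads as the share of the socioeconomic association that runs through the topic score. A real analysis would add covariates and a significance test for the indirect path (for example, a bootstrap).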