Modeling Spatiotemporal Factors Associated With Sentiment on Twitter: Synthesis and Suggestions for Improving the Identification of Localized Deviations.

Authors

Shah Zubair, Martin Paige, Coiera Enrico, Mandl Kenneth D, Dunn Adam G

Affiliations

Centre for Health Informatics, Australian Institute for Health Innovation, Macquarie University, Sydney, Australia.

Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, United States.

Publication

J Med Internet Res. 2019 May 8;21(5):e12881. doi: 10.2196/12881.

Abstract

BACKGROUND

Studies examining how sentiment on social media varies depending on timing and location appear to produce inconsistent results, making it hard to design systems that use sentiment to detect localized events for public health applications.

OBJECTIVE

The aim of this study was to measure how common timing and location confounders explain variation in sentiment on Twitter.

METHODS

Using a dataset of 16.54 million English-language tweets from 100 cities posted between July 13 and November 30, 2017, we estimated the positive and negative sentiment for each of the cities using a dictionary-based sentiment analysis. We then constructed models to explain the differences in sentiment using time of day, day of week, weather, city, and interaction type (conversations or broadcasting) as factors. All factors were independently associated with sentiment.
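
As a rough illustration of the dictionary-based scoring described above, the sketch below counts lexicon matches per tweet and averages the scores per city. The abstract does not name the lexicon or the exact scoring rule, so the word lists, the function names `score_tweet` and `city_sentiment`, and the token-fraction score are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicons; the study's actual dictionary is not given in the abstract.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "sad", "hate", "awful"}

def score_tweet(text):
    """Return (positive, negative) as the fraction of tokens found in each lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0, 0.0
    pos = sum(t in POSITIVE for t in tokens) / len(tokens)
    neg = sum(t in NEGATIVE for t in tokens) / len(tokens)
    return pos, neg

def city_sentiment(tweets):
    """Average per-tweet scores by city; `tweets` is an iterable of (city, text)."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # [sum_pos, sum_neg, count]
    for city, text in tweets:
        pos, neg = score_tweet(text)
        totals[city][0] += pos
        totals[city][1] += neg
        totals[city][2] += 1
    return {city: (p / n, g / n) for city, (p, g, n) in totals.items()}
```

Keeping positive and negative scores separate, rather than netting them into one number, mirrors the abstract's separate models for positive and negative sentiment.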

RESULTS

In the full multivariable model of positive (Pearson r in test data 0.236; 95% CI 0.231-0.241) and negative (Pearson r in test data 0.306; 95% CI 0.301-0.310) sentiment, the city and time of day explained more of the variance than weather and day of week. Models that account for these confounders produce a different distribution and ranking of important events compared with models that do not account for these confounders.
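
For readers unfamiliar with the reported statistic, the sketch below shows one common way to compute a Pearson correlation between predicted and observed values and an approximate 95% CI using the Fisher z-transformation. The abstract does not state how its intervals were obtained, so the Fisher method and the helper names `pearson_r` and `fisher_ci` are assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate CI for r from n paired observations (Fisher z-transformation).

    z_crit defaults to the standard normal quantile for a 95% interval.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

With very large test sets the standard error shrinks toward zero, which is why a modest correlation can still carry a narrow interval.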

CONCLUSIONS

In public health applications that aim to detect localized events by aggregating sentiment across populations of Twitter users, it is worthwhile accounting for baseline differences before looking for unexpected changes.
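
The idea of accounting for baseline differences before looking for unexpected changes can be sketched as follows. This is a deliberately simplified stand-in, not the study's multivariable model: it uses only a per-city, per-hour historical mean as the baseline, and the function name `flag_deviations`, the tuple layout, and the threshold of 3 standard deviations are assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev

def flag_deviations(history, current, threshold=3.0):
    """Flag observations that depart from their own (city, hour) baseline.

    `history` and `current` are lists of (city, hour, sentiment) tuples.
    An observation is flagged when it lies more than `threshold` standard
    deviations from the historical mean for the same city and hour.
    """
    groups = defaultdict(list)
    for city, hour, value in history:
        groups[(city, hour)].append(value)

    flagged = []
    for city, hour, value in current:
        past = groups.get((city, hour))
        if not past or len(past) < 2:
            continue  # no usable baseline for this city and hour
        mu, sd = mean(past), pstdev(past)
        if sd > 0 and abs(value - mu) / sd > threshold:
            flagged.append((city, hour, value))
    return flagged
```

In a usage such as the one below, the same sentiment value is flagged in one hypothetical city and ignored in another, because each is compared with its own baseline rather than a global one:

```python
history = [("A", 9, 0.10), ("A", 9, 0.12), ("A", 9, 0.11),
           ("B", 9, 0.30), ("B", 9, 0.32), ("B", 9, 0.31)]
current = [("A", 9, 0.30), ("B", 9, 0.31)]
flag_deviations(history, current)  # only the city "A" observation is returned
```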
