Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Informatics Program, Boston Children's Hospital, Boston, MA, USA.
Informatics Program, Boston Children's Hospital, Boston, MA, USA; Department of Statistics, Boston University, Boston, MA, USA.
Prev Med. 2019 Apr;121:86-93. doi: 10.1016/j.ypmed.2019.02.005. Epub 2019 Feb 8.
Air pollution is a serious public health concern. Innovative and scalable methods for detecting harmful air pollutants such as PM2.5 are necessary. This study assessed the feasibility of using social media to monitor outdoor air pollution in an urban area by comparing Twitter data against data from established air monitoring stations. Data were collected for London, England, from July 29, 2016 to March 17, 2017. Daily mean PM2.5 data were downloaded from the LondonAir platform, which comprises 26 air pollution monitoring sites throughout Greater London. Publicly available tweets geo-located to Greater London and containing air pollution terms were captured from the Twitter platform. Tweets with media URL links were excluded to minimize the influence of news stories. The sentiment of each tweet was classified as negative, positive, or neutral. Cross-correlation analyses were used to examine the relationship between trends in tweets about air pollution and levels of PM2.5 over time. There were 16,448 tweets without a media URL link, with a mean of 498.42 (SD = 491.08) tweets per week. A significant cross-correlation coefficient of 0.803 was observed between PM2.5 data and the non-media air pollution tweets (p < 0.001). The cross-correlation coefficient was highest, at 0.816 (p < 0.001), between PM2.5 data and air pollution tweets with negative sentiment. Discussions about air pollution on Twitter reflect PM2.5 particulate pollution levels in Greater London. This study highlights that social media may offer a supplemental source to support the detection and monitoring of air pollution in a densely populated urban area.
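The abstract does not say how the cross-correlation analysis was computed, so the following is only a minimal sketch of the general technique, not the authors' procedure. It correlates a weekly PM2.5 series with a weekly tweet-count series at several lags, using a plain Pearson coefficient on the overlapping portion of the two series (a simplification of the standard cross-correlation estimator). The function names `pearson` and `cross_correlation`, the `max_lag` parameter, and all numeric values are hypothetical and chosen purely for illustration; they are not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def cross_correlation(pm25, tweets, max_lag=2):
    """Correlate pm25[t] with tweets[t + lag] for each lag in [-max_lag, max_lag].

    A positive lag means tweets follow PM2.5; a negative lag means they lead it.
    Only the overlapping part of the two series is used at each lag.
    """
    n = len(pm25)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = pm25[: n - lag], tweets[lag:]
        else:
            a, b = pm25[-lag:], tweets[: n + lag]
        result[lag] = pearson(a, b)
    return result


# Made-up illustrative weekly values, NOT data from the study.
pm25_weekly = [8.0, 12.0, 9.0, 20.0, 15.0, 11.0, 25.0, 14.0]
tweets_weekly = [100, 180, 120, 400, 260, 150, 520, 230]

ccf = cross_correlation(pm25_weekly, tweets_weekly, max_lag=2)
best_lag = max(ccf, key=ccf.get)  # lag with the strongest positive correlation
for lag, r in sorted(ccf.items()):
    print(f"lag {lag:+d} weeks: r = {r:.3f}")
```

A real analysis would more likely rely on an established statistics package and would also need a significance test for each coefficient, which this sketch omits.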