Tao Zhu, Kokas Aynne, Zhang Rui, Cohan Daniel S, Wallach Dan
Department of Computer Science, Rice University, Houston, Texas, United States of America.
Department of Media Studies, University of Virginia, Charlottesville, Virginia, United States of America.
PLoS One. 2016 Sep 20;11(9):e0161389. doi: 10.1371/journal.pone.0161389. eCollection 2016.
Although studies have increasingly linked air pollution to specific health outcomes, less well understood is how public perceptions of air quality respond to changing pollutant levels. The growing availability of air pollution measurements and the proliferation of social media provide an opportunity to gauge public discussion of air quality conditions. In this paper, we consider particulate matter (PM) measurements from four Chinese megacities (Beijing, Shanghai, Guangzhou, and Chengdu) together with 112 million posts on Weibo (a popular Chinese microblogging system) from corresponding days in 2011-2013 to identify terms whose frequency was most correlated with PM levels. These correlations are used to construct an Air Discussion Index (ADI) for estimating daily PM based on the content of Weibo posts. In Beijing, the Chinese city with the highest PM levels as measured by U.S. Embassy monitoring stations, we found a strong correlation (R = 0.88) between the ADI and measured PM. In other Chinese cities with lower pollution levels, the correlation was weaker. Nonetheless, our results show that social media may be a useful proxy measurement for pollution, particularly when data from traditional measurement stations are unavailable, censored, or misreported.
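The term-selection and index-building steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the ADI is computed, so the correlation-weighted sum used here is an assumption, as are all function names and the toy data; this is not the authors' implementation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def top_correlated_terms(term_freqs, pm, k):
    """Rank terms by how strongly their daily frequency tracks daily PM; keep the top k."""
    scored = {term: pearson(freqs, pm) for term, freqs in term_freqs.items()}
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]

def air_discussion_index(term_freqs, weighted_terms):
    """Hypothetical index (assumption): correlation-weighted sum of term frequencies per day."""
    n_days = len(next(iter(term_freqs.values())))
    return [
        sum(weight * term_freqs[term][day] for term, weight in weighted_terms)
        for day in range(n_days)
    ]

# Made-up toy data: daily PM readings and per-day term frequencies
# (fraction of that day's posts containing the term).
pm = [35.0, 80.0, 150.0, 300.0, 60.0]
term_freqs = {
    "haze": [0.001, 0.002, 0.004, 0.009, 0.0015],
    "mask": [0.0005, 0.001, 0.002, 0.004, 0.001],
    "lunch": [0.010, 0.009, 0.011, 0.010, 0.010],
}

selected = top_correlated_terms(term_freqs, pm, k=2)
adi = air_discussion_index(term_freqs, selected)
r = pearson(adi, pm)
print(selected, r)
```

In this sketch the terms are selected and the index is evaluated on the same days, which inflates the correlation; a realistic evaluation would select terms on one set of days and measure R on held-out days.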