Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA 30322.
AMIA Jt Summits Transl Sci Proc. 2022 May 23;2022:313-322. eCollection 2022.
We investigated the utility of Twitter for conducting multi-faceted, geolocation-centric pandemic surveillance, using India as an example. We collected over 4 million COVID-19-related tweets about the Indian outbreak posted between January and July 2021. We geolocated the tweets, applied natural language processing to characterize them (e.g., identifying symptoms and emotions), and compared tweet volumes with the numbers of confirmed COVID-19 cases. Tweet numbers closely mirrored the outbreak: the 7-day average of tweet counts was strongly correlated with confirmed COVID-19 cases nationally (Spearman r=0.944; p=0.001) and at the state level (Spearman r=0.84; p=0.0003). Fatigue, dyspnea, and cough were the top symptoms detected, and there was a significant increase in the proportion of tweets expressing negative emotions (e.g., fear and sadness). The surge in COVID-19 tweets was followed by an increased number of posts expressing concern about black fungus and oxygen supply. Our study illustrates the potential of social media for multi-faceted pandemic surveillance.
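The comparison of smoothed tweet volume with confirmed case counts can be illustrated with a minimal sketch. This is not the authors' code: the daily counts are invented for illustration, the function names are hypothetical, and the choice to smooth only the tweet series with a trailing 7-day window (then align it with the raw case series) is an assumption, since the abstract does not specify these details.

```python
from statistics import mean


def rolling_mean(values, window=7):
    """Trailing moving average; the first window-1 days yield no value."""
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented daily counts, for illustration only (not data from the study).
daily_tweets = [120, 150, 130, 170, 200, 260, 310,
                400, 520, 610, 700, 680, 640, 590]
daily_cases = [1000, 1100, 1050, 1300, 1600, 2100, 2600,
               3300, 4100, 5000, 5600, 5900, 5700, 5400]

smoothed_tweets = rolling_mean(daily_tweets, window=7)
aligned_cases = daily_cases[6:]  # drop the days that have no 7-day average
rho = spearman(smoothed_tweets, aligned_cases)
print(f"Spearman rho = {rho:.3f}")
```

Spearman's coefficient depends only on rank order, so it measures whether the two series rise and fall together without assuming a linear relationship between tweet volume and case counts. The sketch does not compute a p-value; a statistics library would be the usual way to obtain one.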