College of Engineering, Mathematics and Physical Sciences, University of Exeter, North Park Road, Exeter, EX4 4QF, UK.
Sci Rep. 2021 Feb 11;11(1):3647. doi: 10.1038/s41598-021-82808-x.
People often talk about the weather on social media, using different vocabulary to describe different conditions. Here we combine a large collection of wind-related Twitter posts (tweets) and UK Met Office wind speed observations to explore the relationship between tweet volume, tweet language and wind speeds in the UK. We find that wind speeds are experienced subjectively relative to the local baseline, so that the same absolute wind speed is reported as stronger or weaker depending on the typical weather conditions in the local area. Different linguistic tokens (words and emojis) are associated with different wind speeds. These associations can be used to create a simple text classifier to detect 'high-wind' tweets with reasonable accuracy; this can be used to detect high winds in a locality using only a single tweet. We also construct a 'social Beaufort scale' to infer wind speeds based only on the language used in tweets. Together with the classifier, this demonstrates that language alone is indicative of weather conditions, independent of tweet volume. However, the number of high-wind tweets shows a strong temporal correlation with local wind speeds, increasing the ability of a combined language-plus-volume system to successfully detect high winds. Our findings complement previous work in social sensing of weather hazards that has focused on the relationship between tweet volume and severity. These results show that impacts of wind and storms are found in how people communicate and use language, a novel dimension in understanding the social impacts of extreme weather.
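As a rough illustration of how token-wind associations could drive a single-tweet 'high-wind' classifier, the following is a minimal sketch and not the method used in the paper. It assumes a naive-Bayes-style log-odds score over whitespace-separated tokens; the function names (train_token_log_odds, is_high_wind), the smoothing and threshold parameters, and the toy tweets are all hypothetical and invented for illustration.

```python
import math
from collections import Counter

def train_token_log_odds(labelled_tweets, smoothing=1.0):
    """Return per-token log-odds of appearing in high-wind vs. other tweets.

    labelled_tweets: iterable of (text, is_high_wind) pairs (hypothetical format).
    """
    high, low = Counter(), Counter()
    for text, is_high in labelled_tweets:
        tokens = set(text.lower().split())  # count each token once per tweet
        (high if is_high else low).update(tokens)
    vocab = set(high) | set(low)
    # Additive smoothing so unseen token/class pairs do not give log(0).
    n_high = sum(high.values()) + smoothing * len(vocab)
    n_low = sum(low.values()) + smoothing * len(vocab)
    return {
        t: math.log((high[t] + smoothing) / n_high)
        - math.log((low[t] + smoothing) / n_low)
        for t in vocab
    }

def is_high_wind(text, log_odds, threshold=0.0):
    """Classify one tweet by summing the log-odds of its known tokens."""
    score = sum(log_odds.get(t, 0.0) for t in set(text.lower().split()))
    return score > threshold

# Invented toy data; real labels would come from matched wind observations.
TOY_TWEETS = [
    ("gale blew the fence down", True),
    ("storm winds tore roof tiles off", True),
    ("nice light breeze today", False),
    ("gentle breeze in the park", False),
]
model = train_token_log_odds(TOY_TWEETS)
```

Calling is_high_wind("gale and storm tonight", model) on this toy model flags the tweet as high-wind, while "a gentle breeze" is not flagged. A real system would need emoji-aware tokenisation and labels defined relative to the local wind baseline, as the abstract describes.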