Contreras Diana, Wilkinson Sean, Alterman Evangeline, Hervás Javier
Centre for Resilience and Environmental Change (CHANGING), School of Earth and Environmental Sciences, College of Physical Sciences and Engineering, Cardiff University, Main Building, Park Place, Cardiff, CF10 3AT UK.
Learning from Earthquakes (LfE), School of Engineering, Faculty of Science, Agriculture and Engineering, Newcastle University, 2nd Floor Drummond Building, Newcastle, NE1 7RU UK.
Nat Hazards (Dordr). 2022;113(1):403-421. doi: 10.1007/s11069-022-05307-w. Epub 2022 Mar 23.
Traditionally, earthquake impact assessments have relied on fieldwork and data collection sponsored by non-governmental organisations (NGOs); however, this approach is time-consuming, expensive and often limited in scope. Recently, social media (SM) has become a valuable tool for quickly collecting large amounts of first-hand data after a disaster, and it shows great potential for supporting decision-making. Nevertheless, extracting meaningful information from SM remains an ongoing area of research. This paper tests the accuracy of the pre-trained sentiment analysis (SA) model developed by the no-code machine learning platform MonkeyLearn, using text data related to the emergency response and early recovery phase of the three major earthquakes that struck Albania on 26 November 2019. These events caused 51 deaths, 3000 injuries and extensive damage. We obtained 695 tweets with the hashtags #Albania, #AlbanianEarthquake and #albanianearthquake, posted between 26 November 2019 and 3 February 2020. We used these data to test how accurately the pre-trained SA classification model identifies polarity in text data. The test explores the feasibility of automating the classification process so that meaningful information can be extracted from SM text data in real time in the future. We evaluated the platform's performance using a confusion matrix and obtained an overall accuracy (ACC) of 63% and a misclassification rate of 37%. We conclude that the ACC of the unsupervised classification is sufficient for a preliminary assessment, but further research is needed to determine whether accuracy improves when the training model of the machine learning platform is customised.
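To illustrate how an overall accuracy and misclassification rate can be read off a confusion matrix, the sketch below builds one from manually assigned and model-predicted polarity labels. It is a minimal illustration and not the authors' workflow: the three polarity classes, the function names and the eight toy labels are assumptions made for the example, and none of the values come from the 695 tweets in the study.

```python
# Hypothetical polarity classes; the abstract does not list the labels used.
CLASSES = ["negative", "neutral", "positive"]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Return a square matrix where rows are true classes and columns are predictions."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, prediction in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[prediction]] += 1
    return matrix


def overall_accuracy(matrix):
    """ACC = correctly classified items (the diagonal) divided by all items."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Toy labels for illustration only (not data from the paper).
manual = ["positive", "negative", "neutral", "negative",
          "positive", "neutral", "negative", "positive"]
model = ["positive", "negative", "positive", "negative",
         "neutral", "neutral", "positive", "positive"]

matrix = confusion_matrix(manual, model)
acc = overall_accuracy(matrix)
misclassification_rate = 1 - acc  # the two rates always sum to 1

print(matrix)
print(f"ACC = {acc:.1%}, misclassification = {misclassification_rate:.1%}")
```

Because the misclassification rate is defined as one minus the accuracy, the reported 63% and 37% are two views of the same quantity.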