Kellner Domenic, Lowin Maximilian, Hinz Oliver
Goethe University Frankfurt, Theodor-W.-Adorno-Platz 4, D-60629 Frankfurt am Main, Germany.
Decis Support Syst. 2023 Apr 24:113983. doi: 10.1016/j.dss.2023.113983.
Managing an extreme event like a healthcare disaster requires accurate information about the event's circumstances to comprehend the full consequences of acting. However, information quality is rarely optimal since it takes time to determine the information of relevance. The COVID-19 pandemic showed that even official data sources are far from optimal since they suffer from reporting delays that slow decision-making. To support decision-makers with timely information, we utilize data from online social networks to propose an adaptable information extraction solution to create indices helping to forecast COVID-19 case numbers and hospitalization rates. We show that combining heterogeneous data sources like Twitter and Reddit can leverage these sources' inherent complementarity and yield better predictions than those using a single data source alone. We further show that the predictions run ahead of the official COVID-19 incidences by up to 14 days. Additionally, we highlight the importance of model adjustments whenever new information becomes available or the underlying data changes by observing distinct changes in the presence of specific symptoms on Reddit.
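The lead-time claim above can be illustrated with a lagged correlation. The following is a minimal sketch on synthetic data, not the authors' pipeline: the series, the function names (`zscore`, `pearson`, `combine`, `best_lead`, `demo`), the noise level, and the 10-day offset built into the toy data are all assumptions made for illustration. It averages two standardized noisy indices, standing in for Twitter and Reddit, and searches for the shift (up to 14 days) at which the index correlates best with a delayed "official" series.

```python
import math
import random
import statistics


def zscore(series):
    """Standardize a series to zero mean and unit (population) variance."""
    mu = statistics.mean(series)
    sd = statistics.pstdev(series)
    return [(v - mu) / sd for v in series]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def combine(*sources):
    """Average several indices after putting them on a common scale."""
    scaled = [zscore(s) for s in sources]
    return [statistics.mean(vals) for vals in zip(*scaled)]


def best_lead(index, official, max_lead=14):
    """Return (lead, r): the shift in days at which index[t] correlates
    best with official[t + lead], searched over 0..max_lead."""
    best = (0, -1.0)
    for lead in range(max_lead + 1):
        x = index[: len(index) - lead]  # index values at day t
        y = official[lead:]             # official values at day t + lead
        r = pearson(x, y)
        if r > best[1]:
            best = (lead, r)
    return best


def demo(n_days=200, true_lead=10, noise_sd=10.0, seed=0):
    """Synthetic example: both indices see the latent signal `true_lead`
    days before the official series does (an assumption of this toy setup)."""
    rng = random.Random(seed)
    latent = [
        100 + 50 * math.sin(2 * math.pi * t / 60) + 20 * math.sin(2 * math.pi * t / 23)
        for t in range(n_days + true_lead)
    ]
    official = latent[:n_days]   # delayed, noise-free reporting
    ahead = latent[true_lead:]   # what the social indices track
    twitter = [v + rng.gauss(0, noise_sd) for v in ahead]
    reddit = [v + rng.gauss(0, noise_sd) for v in ahead]
    return {
        "twitter": best_lead(twitter, official),
        "reddit": best_lead(reddit, official),
        "combined": best_lead(combine(twitter, reddit), official),
    }


if __name__ == "__main__":
    for name, (lead, r) in demo().items():
        print(f"{name}: best lead = {lead} days, r = {r:.3f}")
```

In this toy setup the two sources carry independent noise, so averaging them tends to give a cleaner index than either alone. Real social-media indices would need the symptom extraction, model selection, and periodic re-fitting the abstract describes, none of which is modeled here.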