A new data integration framework for Covid-19 social media information.

Affiliation

School of Engineering, Computing and Mathematics, University of Plymouth, Plymouth, PL4 8AA, UK.

Publication information

Sci Rep. 2023 Apr 15;13(1):6170. doi: 10.1038/s41598-023-33141-y.

Abstract

The Covid-19 pandemic presents a serious threat to people's health, resulting in over 250 million confirmed cases and over 5 million deaths globally. To reduce the burden on national health care systems and to mitigate the effects of the outbreak, accurate modelling and forecasting methods for short- and long-term health demand are needed to inform government interventions aiming at curbing the pandemic. Current research on Covid-19 is typically based on a single source of information, specifically on structured historical pandemic data. Other studies are exclusively focused on unstructured online retrieved insights, such as data available from social media. However, the combined use of structured and unstructured information is still uncharted. This paper aims at filling this gap, by leveraging historical and social media information with a novel data integration methodology. The proposed approach is based on vine copulas, which allow us to exploit the dependencies between different sources of information. We apply the methodology to combine structured datasets retrieved from official sources and a big unstructured dataset of information collected from social media. The results show that the combined use of official and online generated information contributes to yield a more accurate assessment of the evolution of the Covid-19 pandemic, compared to the sole use of official data.
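To make the idea of exploiting dependence between two information sources concrete, here is a minimal sketch of a single Gaussian pair-copula, the kind of bivariate building block that vine copulas chain together. It is not the authors' implementation: the toy series, the function names, and the choice of a Gaussian family fitted by inverting Kendall's tau are all assumptions made for illustration, and a real analysis would select among copula families and assemble a full vine with a dedicated library.

```python
import math
from statistics import NormalDist


def pseudo_observations(xs):
    """Map a sample into (0, 1) via rank / (n + 1). Assumes no ties."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def gaussian_rho_from_tau(tau):
    """Gaussian copula correlation implied by Kendall's tau."""
    return math.sin(math.pi * tau / 2)


def gaussian_h(u, v, rho):
    """Conditional CDF C(u | v) of a Gaussian pair-copula (requires |rho| < 1)."""
    nd = NormalDist()
    z = (nd.inv_cdf(u) - rho * nd.inv_cdf(v)) / math.sqrt(1 - rho ** 2)
    return nd.cdf(z)


# Hypothetical toy data, NOT taken from the paper:
# weekly official case counts and weekly counts of related social media posts.
official_cases = [120, 135, 160, 210, 260, 300, 340, 390]
social_posts = [40, 38, 55, 70, 66, 95, 110, 105]

u = pseudo_observations(official_cases)
v = pseudo_observations(social_posts)
tau = kendall_tau(official_cases, social_posts)
rho = gaussian_rho_from_tau(tau)

# Probability that official cases sit below their median rank,
# given that social media activity is at a high rank (0.9).
p_low_cases_given_high_posts = gaussian_h(0.5, 0.9, rho)
print(f"tau={tau:.3f}, rho={rho:.3f}, h(0.5 | 0.9)={p_low_cases_given_high_posts:.3f}")
```

With positively dependent series, the conditional probability `gaussian_h(0.5, 0.9, rho)` falls below 0.5: observing high social media activity shifts the conditional distribution of official cases upward, which is the kind of cross-source information a vine copula is designed to capture.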

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/752d/10105699/ebc940f5aacf/41598_2023_33141_Fig1_HTML.jpg
