Department of Public Health Sciences, College of Health and Human Services, University of North Carolina at Charlotte, Charlotte, NC, United States.
School of Data Science, University of North Carolina at Charlotte, Charlotte, NC, United States.
Front Public Health. 2023 Mar 16;11:1111661. doi: 10.3389/fpubh.2023.1111661. eCollection 2023.
Comprehensive surveillance systems are key to providing accurate data for effective modeling. Traditional symptom-based case surveillance has been joined by more recent genomic, serologic, and environmental surveillance to form more integrated disease surveillance systems. A major gap in comprehensive disease surveillance is the accurate, real-time monitoring of potential population behavioral changes. Population-wide behaviors, such as compliance with various interventions and vaccination acceptance, significantly influence and drive the overall epidemic dynamics in society. Original infoveillance utilized online query data (e.g., Google and Wikipedia searches on a specific topic such as an epidemic); it later focused on large volumes of online discourse data from social media platforms and was used to augment epidemic modeling. It mainly uses the number of posts to approximate public awareness of the disease, and compares this with observed epidemic dynamics to improve projections. The current COVID-19 pandemic shows an urgent need to further harness the rich, detailed content and sentiment information in these data, which can provide more accurate and granular information on public awareness of and perceptions toward multiple aspects of the disease, especially various interventions. In this perspective paper, we describe a novel conceptual analytical framework for content and sentiment infoveillance (CSI) and its integration with epidemic modeling. The CSI framework includes data retrieval and pre-processing; information extraction via natural language processing to identify and quantify detailed time, location, content, and sentiment information; and integration of infoveillance with common epidemic modeling techniques, both mechanistic and data-driven. CSI complements and significantly enhances current epidemic models for more informed decision-making by integrating behavioral aspects from detailed, instantaneous infoveillance of massive social media data.
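To make the three stages of the framework concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy word lexicon in place of a real natural language processing model, posts already retrieved as (day, text) pairs, and a simple discrete-time SIR model in which the transmission rate is lowered when average sentiment toward an intervention is favorable. All names (`sentiment_score`, `daily_sentiment`, `simulate_sir`), the lexicon, and the parameter values are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real pipeline would use a trained NLP model.
POSITIVE = {"support", "safe", "effective", "protect"}
NEGATIVE = {"refuse", "hoax", "unsafe", "useless"}


def sentiment_score(text):
    """Return a crude score in [-1, 1] from lexicon word counts.

    Tokenization is whitespace-only, so punctuation is not stripped.
    """
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def daily_sentiment(posts):
    """Average sentiment per day; posts are (day_index, text) pairs."""
    by_day = defaultdict(list)
    for day, text in posts:
        by_day[day].append(sentiment_score(text))
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}


def simulate_sir(n_days, sentiment, beta0=0.3, gamma=0.1, effect=0.5,
                 s0=0.99, i0=0.01):
    """Discrete-time SIR on population fractions.

    Assumption: daily sentiment in [-1, 1] is mapped to a compliance level
    in [0, 1], which scales the transmission rate down by up to `effect`.
    Days without posts are treated as neutral (sentiment 0).
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    infected = []
    for day in range(n_days):
        compliance = (sentiment.get(day, 0.0) + 1.0) / 2.0
        beta = beta0 * (1.0 - effect * compliance)
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        infected.append(i)
    return infected


if __name__ == "__main__":
    posts = [(0, "masks are safe and effective"), (0, "total hoax"),
             (1, "I support vaccination")]
    signal = daily_sentiment(posts)
    curve = simulate_sir(60, signal)
    print(signal)
    print(max(curve))
```

In this sketch the sentiment signal enters a mechanistic model through the transmission rate; a data-driven variant could instead pass the same daily signal as a covariate to a forecasting model.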