Jourdain Frédéric, Chakraborty Debapriyo, Gaillard Beatrice, Gautier Arnaud, Simard Frédéric, Robert Pierre Jay, Dormont Laurent, Desenclos Jean-Claude, Roche Benjamin
Santé Publique France, Saint-Maurice, France.
MIVEGEC, University of Montpellier, CNRS, IRD, Montpellier, France.
PLOS Glob Public Health. 2025 Jul 24;5(7):e0004889. doi: 10.1371/journal.pgph.0004889. eCollection 2025.
Human behavior is a fundamental, yet often neglected, component of infectious disease epidemiology, especially during outbreaks. To quantify its role and fluctuations, analyzing message content on popular online social networks, an approach that belongs to so-called digital epidemiology, is promising. However, such methods may be biased and generate estimation errors, since social media users may not be representative of the general population. To address this, we systematically compared social media-derived estimates with those obtained from a large-scale opinion survey. In metropolitan France, where the risk of arbovirus outbreaks is growing, we compared the frequency of three types of emotional states related to human-mosquito contact identified in 160,000 messages on X (formerly Twitter) with the frequency of the same emotional states expressed in a large-scale opinion survey of 15,000 people during the same period. Both data sources were used to parameterize a mathematical model of mosquito-borne virus transmission. We found that estimates of these emotional states for different age groups in the opinion survey could differ substantially from estimates based on X data. Nevertheless, by integrating demographic adjustments and incorporating variability into our transmission models, we showed that the predicted overall outbreak dynamics remain comparable under certain conditions. This study provides the first evidence that using digital social network data to infer epidemiologically relevant behavior can yield results similar to those obtained from large-scale opinion survey data. These outcomes indicate that X data could be used to help forecast outbreak dynamics, opening new opportunities for real-time assessment of health-related human behavior and for the definition of control strategies.