Chin Taylor, Johansson Michael A, Chowdhury Anir, Chowdhury Shayan, Hosan Kawsar, Quader Md Tanvir, Buckee Caroline O, Mahmud Ayesha S
Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Bouvé College of Health Sciences & Network Science Institute, Northeastern University, Boston, MA, USA.
Commun Med (Lond). 2025 Jan 7;5(1):8. doi: 10.1038/s43856-024-00714-5.
Digital data sources such as mobile phone call detail records (CDRs) are increasingly being used to estimate population mobility fluxes and to predict the spatiotemporal dynamics of infectious disease outbreaks. Differences in mobile phone operators' geographic coverage, however, may result in biased mobility estimates.
We leverage a unique dataset consisting of CDRs from three mobile phone operators in Bangladesh and digital trace data from Meta's Data for Good program to compare mobility patterns across these sources. We use a metapopulation model to compare the sources' effects on simulated outbreak trajectories, and compare the results against a benchmark model parameterized with data from all three operators, representing around 100 million subscribers across the country.
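To make the comparison concrete, the following is a minimal sketch of a metapopulation model of the general kind described above. It is not the authors' implementation: the deterministic discrete-time SIR structure, the function names `simulate_metapop_sir` and `arrival_day`, and the values of `beta`, `gamma`, and the example mobility matrix are all illustrative assumptions.

```python
def simulate_metapop_sir(pop, mobility, seed, beta=0.3, gamma=0.1, days=200):
    """Deterministic discrete-time SIR over patches coupled by mobility.

    pop[i]         -- resident population of patch i
    mobility[i][j] -- assumed fraction of time residents of i spend in j
                      (each row sums to 1)
    seed           -- index of the patch where one infection is introduced
    Returns a list with the infectious count per patch for each day.
    """
    n = len(pop)
    S = [float(p) for p in pop]
    I = [0.0] * n
    R = [0.0] * n
    S[seed] -= 1.0
    I[seed] = 1.0
    history = [list(I)]
    for _ in range(days):
        # Infectious people and total people present in each patch j.
        inf_present = [sum(mobility[i][j] * I[i] for i in range(n)) for j in range(n)]
        pop_present = [sum(mobility[i][j] * pop[i] for i in range(n)) for j in range(n)]
        prevalence = [
            inf_present[j] / pop_present[j] if pop_present[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n):
            # Residents of i are exposed wherever they spend their time.
            foi = beta * sum(mobility[i][j] * prevalence[j] for j in range(n))
            new_inf = foi * S[i]
            new_rec = gamma * I[i]
            S[i] -= new_inf
            I[i] += new_inf - new_rec
            R[i] += new_rec
        history.append(list(I))
    return history


def arrival_day(history, patch, threshold=1.0):
    """First day on which the infectious count in `patch` reaches `threshold`."""
    for day, infectious in enumerate(history):
        if infectious[patch] >= threshold:
            return day
    return None


# Hypothetical three-patch example: the same outbreak under two mobility inputs.
pop = [1000, 500, 200]
isolated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coupled = [[0.90, 0.08, 0.02], [0.10, 0.85, 0.05], [0.10, 0.10, 0.80]]
hist_isolated = simulate_metapop_sir(pop, isolated, seed=0)
hist_coupled = simulate_metapop_sir(pop, coupled, seed=0)
```

Running the same simulation with mobility matrices derived from different data sources and comparing `arrival_day` per patch is one simple way to quantify how much the choice of mobility source changes the projected timing of spread.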
We show that mobility sources can vary significantly in their coverage of travel routes and geographic mobility patterns. Differences in projected outbreak dynamics are more pronounced at finer spatial scales, especially if the outbreak is seeded in smaller and/or geographically isolated regions. In some instances, a simple diffusion (gravity) model captured the timing and spatial spread of the outbreak better than the sparser mobility sources.
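For readers unfamiliar with gravity models, a common textbook form sets the flux between two locations proportional to the product of their populations and inversely proportional to a power of the distance between them. The sketch below assumes that form; the abstract does not state the exact specification used, so the function name `gravity_flux`, the exponents, the constant `k`, and the planar coordinates are all hypothetical.

```python
import math


def gravity_flux(pops, coords, k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity-model flux: k * N_i**alpha * N_j**beta / d_ij**gamma.

    pops   -- population of each location
    coords -- (x, y) position of each location, assumed planar for simplicity
    The exponents and k are illustrative defaults, not fitted values.
    """
    n = len(pops)
    flux = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-flux
            d = math.dist(coords[i], coords[j])
            flux[i][j] = k * pops[i] ** alpha * pops[j] ** beta / d ** gamma
    return flux


# Hypothetical example with three locations.
pops = [100, 200, 50]
coords = [(0, 0), (3, 4), (0, 10)]
flux = gravity_flux(pops, coords)
```

Because this form needs only population sizes and distances, it assigns a nonzero flux to every pair of locations, which may be one reason such a model can outperform an observed but sparse mobility source that misses some routes entirely.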
Our results highlight the potential biases in outbreak dynamics predicted by a metapopulation model parameterized with data that are not representative of the population, and the limits to the generalizability of models built on these types of novel human behavioral data.