Bretó Carles, Ionides Edward L, King Aaron A
Department of Statistics, University of Michigan, Ann Arbor, MI.
Departament d'Anàlisi Econòmica, Universitat de València, València, Spain.
J Am Stat Assoc. 2019 Jun 7;115(531):1178-1188. doi: 10.1080/01621459.2019.1604367.
Panel data, also known as longitudinal data, consist of a collection of time series. Each time series, which could itself be multivariate, comprises a sequence of measurements taken on a distinct unit. Mechanistic modeling involves writing down scientifically motivated equations describing the collection of dynamic systems giving rise to the observations on each unit. A defining characteristic of panel systems is that the dynamic interaction between units should be negligible. Panel models therefore consist of a collection of independent stochastic processes, generally linked through shared parameters while also having unit-specific parameters. To give the scientist flexibility in model specification, we are motivated to develop a framework for inference on panel data permitting the consideration of arbitrary nonlinear, partially observed panel models. We build on iterated filtering techniques that provide likelihood-based inference on nonlinear partially observed Markov process models for time series data. Our methodology depends on the latent Markov process only through simulation; this plug-and-play property ensures applicability to a large class of models. We demonstrate our methodology on a toy example and two epidemiological case studies. We address inferential and computational issues arising due to the combination of model complexity and dataset size. Supplementary materials for this article are available online.
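The structure described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. Because units are independent, the panel log-likelihood is the sum of the unit log-likelihoods, and each of these can be estimated with a bootstrap particle filter that touches the latent process only by simulating it (the plug-and-play property). Everything named here is an assumption made for illustration: the toy AR(1) latent process, the shared parameters `phi` and `sigma`, the unit-specific measurement noise `tau`, and all function names. The sketch only evaluates the likelihood at fixed parameters; the iterated filtering step that maximizes it is not shown.

```python
import math
import random


def simulate_step(x, shared, rng):
    """Simulate the next latent state (hypothetical AR(1) toy process).

    The filter below never evaluates a transition density; it only calls
    this simulator, which is what plug-and-play means here.
    """
    return shared["phi"] * x + rng.gauss(0.0, shared["sigma"])


def measurement_logdensity(y, x, specific):
    """Log density of observation y given latent state x (Gaussian noise)."""
    tau = specific["tau"]
    return -0.5 * math.log(2.0 * math.pi * tau ** 2) - 0.5 * ((y - x) / tau) ** 2


def unit_loglik(ys, shared, specific, n_particles, rng):
    """Bootstrap particle filter estimate of one unit's log-likelihood."""
    # Assumed initial distribution for the latent state: standard normal.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Propagate every particle by simulation only.
        particles = [simulate_step(x, shared, rng) for x in particles]
        # Weight particles by the measurement density, stabilized by the max.
        logw = [measurement_logdensity(y, x, specific) for x in particles]
        top = max(logw)
        w = [math.exp(lw - top) for lw in logw]
        # The average weight estimates the conditional likelihood of y.
        loglik += top + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


def panel_loglik(panel, shared, specifics, n_particles=200, seed=1):
    """Sum unit log-likelihoods; valid because units evolve independently."""
    rng = random.Random(seed)
    return sum(
        unit_loglik(panel[u], shared, specifics[u], n_particles, rng)
        for u in panel
    )


if __name__ == "__main__":
    # Made-up observations for two units, for demonstration only.
    panel = {"unit_a": [0.3, -0.1, 0.8], "unit_b": [1.2, 0.9, 1.5, 0.7]}
    shared = {"phi": 0.5, "sigma": 1.0}                       # common to all units
    specifics = {"unit_a": {"tau": 1.0}, "unit_b": {"tau": 0.5}}  # per unit
    print(panel_loglik(panel, shared, specifics))
```

Keeping the shared parameters in one dictionary and the unit-specific ones in a per-unit mapping mirrors the way the abstract says panel models are linked. The units may have time series of different lengths, since each unit is filtered separately.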