Gomes Manuel, Kenward Michael G, Grieve Richard, Carpenter James
Department of Applied Health Research, University College London, London, UK.
Department of Medical Statistics, LSHTM, London, UK.
Stat Med. 2020 May 20;39(11):1658-1674. doi: 10.1002/sim.8504. Epub 2020 Feb 14.
Nonignorable missing data poses key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires the study to make correct assumptions, both about the joint distribution of the missingness and outcome and that there is a valid exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
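To make the role of the exclusion restriction concrete, the following is a minimal simulation sketch of the classic Heckman two-step estimator, compared with a complete-case analysis. It is not the authors' code and does not implement their MI or full-likelihood approaches. All names (`simulate`, `heckman_two_step_effect`, `gamma_z`, and so on) and all parameter values (a true treatment effect of 0.5, error correlation `rho = 0.6`) are illustrative assumptions and are not taken from the REFLUX study. The variable `z` plays the part of the exclusion restriction: it enters the missingness model but not the outcome model.

```python
import math
import numpy as np

# Standard normal pdf/cdf using only the standard library and NumPy.
_erf = np.vectorize(math.erf, otypes=[float])

def norm_pdf(x):
    return np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))

def simulate(n=20000, effect=0.5, rho=0.6, gamma_z=1.0, seed=0):
    """Simulate an outcome that is missing not at random (hypothetical values)."""
    rng = np.random.default_rng(seed)
    treat = rng.integers(0, 2, size=n).astype(float)
    z = rng.normal(size=n)            # exclusion-restriction variable
    u = rng.normal(size=n)            # error in the missingness model
    # Outcome error correlated with u (correlation rho): nonignorable missingness.
    e = rho * u + math.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 1.0 + effect * treat + e      # z does not enter the outcome model
    observed = (0.8 * treat + gamma_z * z + u) > 0
    return treat, z, y, observed

def fit_probit(X, d, n_iter=25):
    """Probit maximum likelihood via Fisher scoring."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = np.clip(norm_cdf(eta), 1e-10, 1.0 - 1e-10)
        phi = norm_pdf(eta)
        score = X.T @ (phi * (d - p) / (p * (1.0 - p)))
        weights = phi**2 / (p * (1.0 - p))
        info = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(info, score)
    return beta

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def complete_case_effect(treat, y, observed):
    """Treatment effect from observed outcomes only, ignoring the missingness."""
    X = np.column_stack([np.ones(observed.sum()), treat[observed]])
    return ols(X, y[observed])[1]

def heckman_two_step_effect(treat, z, y, observed):
    """Two-step estimate: probit for missingness, then OLS with inverse Mills ratio."""
    W = np.column_stack([np.ones(len(treat)), treat, z])
    gamma = fit_probit(W, observed.astype(float))
    eta = W[observed] @ gamma
    mills = norm_pdf(eta) / norm_cdf(eta)   # inverse Mills ratio
    X = np.column_stack([np.ones(observed.sum()), treat[observed], mills])
    return ols(X, y[observed])[1]

if __name__ == "__main__":
    treat, z, y, observed = simulate()
    print("proportion observed:", observed.mean())
    print("complete-case effect:", complete_case_effect(treat, y, observed))
    print("Heckman two-step effect:", heckman_two_step_effect(treat, z, y, observed))
```

Under these assumed settings the complete-case estimate should fall below the true effect, because treatment also affects the chance of being observed, while the two-step estimate should be close to it. Lowering `gamma_z` towards zero weakens the exclusion restriction, leaving identification to rest largely on the nonlinearity of the inverse Mills ratio, which is one way to explore the sensitivity the abstract describes.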