Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK.
Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK.
Stat Med. 2020 May 20;39(11):1641-1657. doi: 10.1002/sim.8503. Epub 2020 Feb 27.
Electronic health records are a valuable data source for investigating health-related questions, and propensity score analysis has become an increasingly popular approach to addressing confounding bias in such investigations. However, because electronic health records are recorded routinely as part of standard clinical care, values are often missing, particularly for potential confounders. In our motivating study, which uses electronic health records to investigate the effect of renin-angiotensin system blockers on the risk of acute kidney injury, two key confounders, ethnicity and chronic kidney disease stage, have 59% and 53% missing data, respectively. The missingness pattern approach (MPA), a variant of the missing indicator approach, has been proposed as a method for handling partially observed confounders in propensity score analysis. In the MPA, propensity scores are estimated separately for each missingness pattern present in the data. Although the assumptions underlying the validity of the MPA are stated in the literature, it can be difficult in practice to assess their plausibility. In this article, we explore the MPA's underlying assumptions by using causal diagrams to assess their plausibility in a range of simple scenarios, drawing general conclusions about situations in which they are likely to be violated. We present a framework providing practical guidance for assessing whether the MPA's assumptions are plausible in a particular setting and thus for deciding when the MPA is appropriate. We apply our framework to our motivating study, showing that the MPA's underlying assumptions appear reasonable, and we demonstrate the application of the MPA to this study.
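The core mechanical step of the MPA, estimating propensity scores separately within each missingness pattern, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the variable names (`treated`, `ckd_stage3plus`, `nonwhite`), the simple gradient-ascent logistic regression, and the toy records are all assumptions made for illustration, and the data are invented rather than taken from the motivating study.

```python
import math
from collections import defaultdict


def missingness_pattern(row, confounders):
    """Tuple of flags: True where the confounder is observed in this row."""
    return tuple(row[c] is not None for c in confounders)


def predict(beta, x):
    """Logistic model probability for one covariate vector x."""
    return 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))


def fit_logistic(X, z, steps=2000, lr=0.5):
    """Plain gradient-ascent logistic regression (illustrative only).

    Each row of X starts with 1.0 for the intercept. The fixed step size
    assumes covariates on roughly a 0/1 scale.
    """
    beta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, zi in zip(X, z):
            resid = zi - predict(beta, xi)
            for j, v in enumerate(xi):
                grad[j] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def mpa_propensity_scores(rows, treatment, confounders):
    """Fit one propensity model per missingness pattern, using only the
    confounders observed in that pattern, and return one score per row."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[missingness_pattern(row, confounders)].append(i)

    scores = [None] * len(rows)
    for pattern, idx in groups.items():
        observed = [c for c, seen in zip(confounders, pattern) if seen]
        X = [[1.0] + [float(rows[i][c]) for c in observed] for i in idx]
        z = [rows[i][treatment] for i in idx]
        beta = fit_logistic(X, z)
        for i, xi in zip(idx, X):
            scores[i] = predict(beta, xi)
    return scores


# Invented toy records: (treated, ckd_stage3plus, nonwhite); None = missing.
toy = [
    (1, None, None), (0, None, None), (0, None, None), (1, None, None), (1, None, None),
    (1, 1, None), (1, 1, None), (0, 1, None), (0, 0, None), (1, 0, None), (0, 0, None),
]
names = ("treated", "ckd_stage3plus", "nonwhite")
rows = [dict(zip(names, t)) for t in toy]
scores = mpa_propensity_scores(rows, "treated", ["ckd_stage3plus", "nonwhite"])
```

In the pattern where both confounders are missing, the model contains only an intercept, so every such record receives the observed treated proportion in that pattern as its score. The resulting scores would then feed into a standard propensity score analysis such as weighting or matching. Whether that analysis is valid depends on the MPA's assumptions holding, which the sketch does not check.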