He Yulei
Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave, Boston, MA 02115, USA.
Circ Cardiovasc Qual Outcomes. 2010 Jan;3(1):98-105. doi: 10.1161/CIRCOUTCOMES.109.875658.
Missing data are a pervasive problem in health investigations. We give some background on missing data analysis and criticize ad hoc methods that are prone to serious problems. We then focus on multiple imputation, in which missing values are first filled in with several sets of plausible values to create multiple completed datasets, then standard complete-data procedures are applied to each completed dataset, and finally the multiple sets of results are combined to yield a single inference. We introduce the basic concepts and general methodology and provide some guidance for application. For illustration, we use a study assessing the effect of cardiovascular diseases on hospice discussion for late-stage lung cancer patients.
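The final step described above, combining the results from the completed datasets into a single inference, can be sketched with the standard combining rules for multiple imputation (commonly called Rubin's rules). This is a minimal illustration, not the article's own code: the function name `pool_estimates` and the input numbers are hypothetical, and it assumes the imputation and per-dataset analysis steps have already produced one point estimate and one squared standard error per completed dataset.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine per-dataset results using the standard multiple imputation
    combining rules (Rubin's rules).

    estimates: point estimates, one per completed dataset
    variances: squared standard errors, one per completed dataset
    Returns (pooled_estimate, total_variance, degrees_of_freedom).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)

    # Total variance: within-imputation part plus inflated between-imputation part.
    total = within + (1 + 1 / m) * between

    # Degrees of freedom for the t reference distribution (large-sample version).
    if between == 0:
        df = float("inf")
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2

    return pooled, total, df


# Hypothetical results from m = 3 completed datasets (illustrative values only).
pooled, total, df = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total, df)
```

The between-imputation term is what separates this from simply averaging the results: it carries the extra uncertainty due to the missing values, which a single filled-in dataset would ignore.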