Papageorgiou Grigorios, Grant Stuart W, Takkenberg Johanna J M, Mokhles Mostafa M
Department of Cardiothoracic Surgery, Erasmus University Medical Center, Rotterdam, Netherlands.
Department of Biostatistics, Erasmus University Medical Center, Rotterdam, Netherlands.
Interact Cardiovasc Thorac Surg. 2018 Aug 1;27(2):153-158. doi: 10.1093/icvts/ivy102.
Missing data are a common challenge in research and can compromise the results of statistical inference when not handled appropriately. This paper aims to introduce the basic concepts of missing data to a non-statistical audience, to list and compare some of the most popular approaches for handling missing data in practice, and to provide guidelines and recommendations for dealing with and reporting missing data in scientific research. Complete case analysis and single imputation are simple approaches for handling missing data and are popular in practice; however, in most cases they are not guaranteed to provide valid inferences. Multiple imputation is a robust and general alternative that is appropriate for data missing at random and overcomes the disadvantages of the simpler approaches, but it should always be conducted with care. These approaches are illustrated and compared in an example application using Cox regression.
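The contrast between the three approaches can be sketched on simulated data. The example below is a minimal illustration, not the paper's analysis. To stay self-contained it estimates a simple mean rather than fitting a Cox regression. The data, the missingness mechanism, and the helper names (`pool_rubin`, `fit_line`, `mean_and_var`) are all assumptions made for this sketch. The multiple imputation step uses a bootstrap-based regression imputation, which is only one of several possible imputation models, and combines the results with Rubin's rules.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Combine m estimates and their squared standard errors with Rubin's rules."""
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (divisor m - 1)
    return pooled, within + (1 + 1 / m) * between


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y regressed on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(xs, ys))
    return intercept, slope, (rss / (len(xs) - 2)) ** 0.5


def mean_and_var(values):
    """Sample mean and the squared standard error of that mean."""
    return statistics.fmean(values), statistics.variance(values) / len(values)


# Simulated data (assumption): y = 2 + x + noise, so the true mean of y is 2.
rng = random.Random(1)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y_full = [2 + xi + rng.gauss(0, 1) for xi in x]

# Missing at random: the chance that y is missing depends only on the observed x.
y = [None if rng.random() < (0.6 if xi > 0 else 0.1) else yi
     for xi, yi in zip(x, y_full)]
observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]

# 1. Complete case analysis: drop every record with a missing y.
cc_mean, cc_var = mean_and_var([yi for _, yi in observed])

# 2. Single (mean) imputation: fill every gap with the observed mean.
filled = [cc_mean if yi is None else yi for yi in y]
si_mean, si_var = mean_and_var(filled)

# 3. Multiple imputation: m completed data sets, each analysed, then pooled.
estimates, variances = [], []
for _ in range(20):
    # Refit on a bootstrap sample so that parameter uncertainty is propagated.
    boot = rng.choices(observed, k=len(observed))
    a, b, sd = fit_line([p[0] for p in boot], [p[1] for p in boot])
    completed = [a + b * xi + rng.gauss(0, sd) if yi is None else yi
                 for xi, yi in zip(x, y)]
    q, u = mean_and_var(completed)
    estimates.append(q)
    variances.append(u)
mi_mean, mi_var = pool_rubin(estimates, variances)

print(f"complete case:       {cc_mean:.3f} (SE {cc_var ** 0.5:.3f})")
print(f"mean imputation:     {si_mean:.3f} (SE {si_var ** 0.5:.3f})")
print(f"multiple imputation: {mi_mean:.3f} (SE {mi_var ** 0.5:.3f})")
```

In this setup, records with large x (and therefore large y) are more likely to be incomplete, so the complete case estimate is pulled below the true mean. Mean imputation reproduces the same biased estimate and additionally reports a smaller standard error, because the filled-in values carry no variability. The pooled multiple imputation estimate uses the observed x to recover the mean and reflects the extra uncertainty through the between-imputation variance.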