Walani Salimah R, Cleland Charles M
March of Dimes Foundation, White Plains, NY, US.
Nurse Res. 2015 May;22(5):13-9. doi: 10.7748/nr.22.5.13.e1319.
AIM: To illustrate, using a secondary data analysis study as an example, how the multiple imputation method can be used to replace missing data.
BACKGROUND: Most large public datasets have missing data, which researchers conducting secondary data analysis studies must handle. Multiple imputation is a widely used technique for replacing missing values while preserving the sample size and sampling variability of the data.
DATA SOURCES: The 2004 National Sample Survey of Registered Nurses.
REVIEW METHODS: The authors created a model to impute missing values using the chained equation method. They applied imputation diagnostic procedures and ran a regression analysis on the imputed data to estimate the difference in log hourly wages between internationally educated and US-educated registered nurses.
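The chained equation idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' imputation model or software: it uses two hypothetical variables (years of experience and log hourly wage), made-up toy values, and plain linear regression with added noise. A full implementation would also draw the regression parameters from their posterior distribution, which this sketch omits.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept, slope, and residual SD for y ~ x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sd

def chained_impute(rows, n_iter=10, seed=0):
    """Return one completed copy of two-column rows; None marks a missing value."""
    rng = random.Random(seed)
    missing = [[r[j] is None for r in rows] for j in range(2)]
    data = [list(r) for r in rows]
    # Step 1: fill every gap with the observed column mean as a starting value.
    for j in range(2):
        col_mean = mean(r[j] for r in rows if r[j] is not None)
        for i, row in enumerate(data):
            if missing[j][i]:
                row[j] = col_mean
    # Step 2: cycle through the variables, re-imputing each from the other.
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j  # index of the predictor column
            obs = [row for i, row in enumerate(data) if not missing[j][i]]
            a, b, sd = fit_line([r[k] for r in obs], [r[j] for r in obs])
            for i, row in enumerate(data):
                if missing[j][i]:
                    # Prediction plus random noise, so imputed values keep some variability.
                    row[j] = a + b * row[k] + rng.gauss(0, sd)
    return data

# Hypothetical toy data: (years of experience, log hourly wage).
rows = [(1.0, 2.9), (3.0, 3.1), (5.0, None), (7.0, 3.4),
        (None, 3.6), (11.0, 3.8), (13.0, None), (15.0, 4.1)]

# Five completed datasets, each from a different random seed.
imputed_sets = [chained_impute(rows, seed=s) for s in range(5)]
```

Each call with a different seed yields a different completed dataset; the spread across those datasets is what carries the uncertainty about the missing values into later analyses.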
DISCUSSION: The authors used multiple imputation procedures to replace missing values in a large dataset with 29,059 observations. Five multiply imputed datasets were created. Imputation diagnostics using time series and density plots showed that imputation was successful. The authors also present an example of using the multiply imputed datasets in a regression analysis to answer a substantive research question.
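Regression results from several imputed datasets are typically combined with Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and its variance adds the average within-dataset variance to the between-dataset variance inflated by (1 + 1/m). The sketch below assumes that approach; the coefficient values and variances are hypothetical placeholders, not results from the study.

```python
from math import sqrt
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total_var = u_bar + (1 + 1 / m) * b
    return q_bar, sqrt(total_var)

# Hypothetical coefficient estimates and squared standard errors from five datasets.
estimates = [0.10, 0.12, 0.11, 0.09, 0.13]
variances = [0.0004] * 5

pooled_estimate, pooled_se = pool_rubin(estimates, variances)
```

The pooled standard error is larger than any single dataset's standard error here, because it also reflects how much the estimate moves from one imputed dataset to the next.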
CONCLUSION: Multiple imputation is a powerful technique for imputing missing values in large datasets while preserving the sample size and variance of the data. Although the chained equation method involves complex statistical computations, recent advances in software and computing power have made it feasible for researchers to apply the technique to large datasets.
IMPLICATIONS FOR RESEARCH/PRACTICE: The authors recommend that nurse researchers use multiple imputation methods to handle missing data in order to improve the statistical power and external validity of their studies.