Midodzi William K, Hayduk Leslie, Cummings Greta G, Estabrooks Carole A, Wallin Lars
Department of Public Health Sciences, University of Alberta, Edmonton, Canada.
Nurs Res. 2007 Jul-Aug;56(4 Suppl):S47-52. doi: 10.1097/01.NNR.0000280633.94149.19.
When doing secondary data analysis, it is not uncommon to find that a key variable was not measured. Often the researcher has no option but to do without the missing indicator, but when nearly parallel datasets exist, the researcher may have other options. In an earlier article leading up to this special issue, this research team was confronted with the problem that research utilization had been measured in only one of two similar datasets: in the 1996 but not the 1998 Alberta Registered Nurse survey. The 1998 dataset had a larger sample size (6,526 nurse respondents, compared with 600 in 1996) and a stronger set of measured variables, but was missing the key variable of interest, research utilization. To overcome this, a regression-based strategy was used to create a research utilization score for each nurse in the 1998 survey by exploiting the availability of several anticipated causes of research utilization in both datasets. Presented here is an alternative and more complicated procedure that might be applied in future investigations. The article presents a methodological understanding of how to use a phantom variable to account for the unmeasured research utilization variable in a two-group structural equation model. This approach could be used to overcome several of the limitations of the regression-based approach to creating a key missing variable when nearly parallel datasets are available.
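The regression-based strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares fitted in the dataset where the outcome was measured, with the fitted coefficients then applied to the same predictors in the dataset where it was not. The abstract does not specify the exact regression model, so the function names (`fit_ols`, `predict_ols`), the number of predictors, and the synthetic data below are all hypothetical; only the two sample sizes (600 and 6,526) come from the text. This is not the phantom-variable structural equation model the article goes on to propose.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y ~ X by ordinary least squares; returns coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Apply fitted coefficients (intercept first) to a predictor matrix."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ coef


# Synthetic stand-ins for the two surveys (NOT the real Alberta data).
rng = np.random.default_rng(0)
n_donor, n_target, n_pred = 600, 6526, 3  # n_pred is an arbitrary assumption

# "Donor" dataset: shared predictors plus the measured outcome.
X_donor = rng.normal(size=(n_donor, n_pred))
y_donor = 1.0 + X_donor @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=n_donor)

# "Target" dataset: the same predictors, but the outcome was never measured.
X_target = rng.normal(size=(n_target, n_pred))

# Fit where the outcome exists, then score every case where it does not.
coef = fit_ols(X_donor, y_donor)
scores_target = predict_ols(coef, X_target)
```

Scores created this way are fitted values, so they carry no residual variation and vary less than directly measured scores would. The abstract does not say whether this is among the limitations the authors have in mind.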