Imai Hissei, Furukawa Toshiaki A, Kasahara Yoriko, Ishimoto Yasuko, Kimura Yumi, Fukutomi Eriko, Chen Wen-Ling, Tanaka Mire, Sakamoto Ryota, Wada Taizo, Fujisawa Michiko, Okumiya Kiyohito, Matsubayashi Kozo
Department of Field Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.
Psychogeriatrics. 2014 Sep;14(3):182-7. doi: 10.1111/psyg.12060.
Missing data are inevitable in almost all medical studies. Imputation methods based on probabilistic models are common, but they cannot impute individual data and require special software. In contrast, the ipsative imputation method, which substitutes each missing item with the mean of the remaining items within the same individual, is easy, does not need any special software, and can provide individual scores. The aim of the present study was to evaluate the validity of the ipsative imputation method using data from the 15-item Geriatric Depression Scale.
Participants were community-dwelling elderly individuals (n = 1178). A structural equation model was constructed. To assess the validity of the imputation method, model fit indexes were calculated for individuals who were missing 20% of data or less and for those who were missing 40% of data or less, under two assumptions: that their correlation coefficients were the same as those of the dataset with no missing items, or that they differed. Finally, we compared the path coefficients of the dataset imputed by ipsative imputation with those of the dataset imputed by multiple imputation.
Under the assumption that the dataset without missing data was the same as the dataset missing 20% of data or less, all of the model fit indexes were better than under the assumption that the datasets differed. Under the same assumption, however, the model fit indexes were worse for the dataset missing 40% of data or less. The path coefficients of the dataset imputed by ipsative imputation and of the dataset imputed by multiple imputation were compatible with each other when the proportion of missing items was 20% or less.
Ipsative imputation appears to be a valid imputation method and can be used to impute data in studies using the 15-item Geriatric Depression Scale, if the percentage of its missing items is 20% or less.
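As an illustration of the ipsative imputation method described above, the following is a minimal sketch in Python, not the authors' implementation. It assumes each of the 15 items is coded 0 or 1 with missing responses represented as None; the function names `ipsative_impute` and `total_score` and the `max_missing` parameter are hypothetical, with the default cutoff of 0.2 taken from the 20% threshold reported in the abstract.

```python
from typing import List, Optional, Sequence


def ipsative_impute(
    items: Sequence[Optional[float]], max_missing: float = 0.2
) -> Optional[List[float]]:
    """Fill missing items (None) with the mean of the same person's answered items.

    Returns None when the respondent has no items or when the proportion of
    missing items exceeds max_missing (assumed cutoff: 20%).
    """
    if not items:
        return None
    n_missing = sum(1 for x in items if x is None)
    if n_missing / len(items) > max_missing:
        return None
    answered = [x for x in items if x is not None]
    person_mean = sum(answered) / len(answered)
    return [person_mean if x is None else x for x in items]


def total_score(
    items: Sequence[Optional[float]], max_missing: float = 0.2
) -> Optional[float]:
    """Sum the items after ipsative imputation; None if too many are missing."""
    imputed = ipsative_impute(items, max_missing)
    return None if imputed is None else sum(imputed)


# Hypothetical respondent: 12 answered items (4 scored 1) and 3 missing items.
respondent = [1, 0, None, 1, 0, 0, None, 1, 0, 0, 0, 1, None, 0, 0]
score = total_score(respondent)
```

For this hypothetical respondent, each missing item is replaced by 4/12, so the imputed total is approximately 5. A respondent with 4 or more missing items out of 15 exceeds the assumed cutoff and receives no score.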