Maydeu-Olivares Albert, Cai Li
Multivariate Behav Res. 2006 Mar 1;41(1):55-64. doi: 10.1207/s15327906mbr4101_4.
The likelihood ratio test statistic $G^2_{\text{dif}}$ is widely used for comparing the fit of nested models in categorical data analysis. In large samples, this statistic is distributed as a chi-square with degrees of freedom equal to the difference in degrees of freedom between the tested models, but only if the least restrictive model is correctly specified. Yet, this statistic is often used in applications without assessing the adequacy of the least restrictive model. This may result in incorrect substantive conclusions as the above large sample reference distribution for $G^2_{\text{dif}}$ is no longer appropriate. Rather, its large sample distribution will depend on the degree of model misspecification of the least restrictive model. To illustrate this, a simulation study is performed where this statistic is used to compare nested item response theory models under various degrees of misspecification of the least restrictive model. $G^2_{\text{dif}}$ was found to be robust only under small model misspecification of the least restrictive model. Consequently, we argue that some indication of the absolute goodness of fit of the least restrictive model is needed before employing $G^2_{\text{dif}}$ to assess relative model fit.
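The recommendation to check absolute fit before relative fit can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions and is not the simulation design of the study: instead of item response theory models it uses a toy pair of nested models for counts of successes in a fixed number of trials, namely a binomial model with a free success probability (the less restrictive model) and the same model with the probability fixed at a hypothetical value `p_null` (the more restrictive model). All function names are made up for this sketch. It reports the absolute fit of the less restrictive model alongside the difference statistic, since the chi-square reference distribution for the difference is only justified when the former is adequate.

```python
import math


def g2(observed, expected):
    """Likelihood ratio statistic G^2 = 2 * sum(O * ln(O / E)).

    Cells with an observed count of zero contribute nothing to the sum.
    """
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1 (stdlib only).

    Starts from the closed forms for df = 1 and df = 2 and adds one term
    per two extra degrees of freedom.
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def binomial_expected(n_total, trials, p):
    """Expected cell counts for 0..trials successes under Binomial(trials, p)."""
    return [
        n_total * math.comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(trials + 1)
    ]


def nested_comparison(counts, p_null=0.5):
    """Compare a free-p binomial model with the nested model p = p_null.

    counts[k] is the number of respondents with k successes.
    """
    trials = len(counts) - 1
    n = sum(counts)
    # Maximum likelihood estimate of p under the less restrictive model.
    p_hat = sum(k * c for k, c in enumerate(counts)) / (n * trials)

    g2_general = g2(counts, binomial_expected(n, trials, p_hat))
    g2_restricted = g2(counts, binomial_expected(n, trials, p_null))
    df_general = len(counts) - 1 - 1   # cells - 1 - one estimated parameter
    df_restricted = len(counts) - 1    # cells - 1, no estimated parameters

    g2_dif = g2_restricted - g2_general
    df_dif = df_restricted - df_general
    return {
        "p_hat": p_hat,
        "g2_general": g2_general,
        "p_general": chi2_sf(g2_general, df_general),  # absolute fit check
        "g2_dif": g2_dif,
        "df_dif": df_dif,
        "p_dif": chi2_sf(g2_dif, df_dif),              # relative fit test
    }


if __name__ == "__main__":
    # Made-up counts of 0, 1, 2, 3 successes for 100 respondents.
    result = nested_comparison([10, 40, 35, 15])
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

In this sketch, a small `p_general` signals that the less restrictive model itself fits poorly, in which case `p_dif` should not be interpreted against the usual chi-square reference distribution.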