Vanier Antoine, Sébille Véronique, Blanchin Myriam, Guilleux Alice, Hardouin Jean-Benoit
EA 4275 Biostatistics Pharmacoepidemiology and Subjective Measures in Health Sciences, LUNAM, University of Nantes, Nantes, France.
Qual Life Res. 2015 Aug;24(8):1799-807. doi: 10.1007/s11136-015-0938-2. Epub 2015 Feb 11.
This simulation study was designed to assess the performance of Oort's procedure (OP) for item-level response shift (RS) detection, in terms of type I error, power, and overall performance, according to sample characteristics. A specific objective was to assess the impact of using different information criteria (IC) as alternatives to the likelihood-ratio test (LRT) for the global assessment of RS occurrence.
Responses to five binary items at two times of measurement were simulated. Thirty-six combinations of sample characteristics [sample size (n), "true change," correlations between the two latent variables and presence/absence of uniform recalibration RS (ur)] were considered. A thousand datasets were generated for each combination. RS detection was performed on each dataset following OP. Type I error and power of the global assessment of RS occurrence, as well as overall performance of the OP, were assessed.
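With a thousand datasets per combination, type I error and power are estimated as the proportion of datasets in which the global test flags RS. The following is a minimal sketch of that estimate and of its Monte Carlo standard error. The function names and the 5% threshold default are illustrative assumptions, not taken from the study's code.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Proportion of replications whose p-value falls below alpha.

    Estimates the type I error when the data were simulated without RS,
    and the power when RS was simulated. Returns (rate, standard_error).
    """
    n = len(p_values)
    rate = sum(p < alpha for p in p_values) / n
    # Binomial standard error of a proportion estimated from n replications.
    se = math.sqrt(rate * (1 - rate) / n)
    return rate, se


def monte_carlo_se(true_rate, n_replications):
    """Standard error expected for a given true rejection rate."""
    return math.sqrt(true_rate * (1 - true_rate) / n_replications)


if __name__ == "__main__":
    # Hypothetical p-values from four replications, for illustration only.
    print(rejection_rate([0.01, 0.20, 0.04, 0.50]))
    # Precision of a nominal 5% type I error with 1000 replications.
    print(round(monte_carlo_se(0.05, 1000), 4))
```

Under these assumptions, a nominal 5% rate estimated from 1000 replications has a standard error of about 0.7 percentage points, so estimates between roughly 3.6% and 6.4% are compatible with the nominal level.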
The estimated type I error was close to 5% for the LRT and below 5% for the IC. Estimated power was higher for the LRT than for the AIC, which in turn had the highest power among the IC. For the LRT, estimated power was below 80% for n = 100 and for the combination of n = 200 with ur = 1 item. For all other combinations of sample characteristics, it was above 90%.
The LRT yielded higher estimated power than the IC while keeping the type I error at an appropriate level. These results are consistent with Oort's proposal to use the LRT as the criterion for assessing global RS occurrence.
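One general reason an information criterion can be more conservative than the LRT is sketched below. For two nested models, the AIC (minus twice the log-likelihood plus twice the number of parameters) favors the larger model exactly when the LRT statistic exceeds twice the difference in degrees of freedom (df). The abstract does not state the df of the global test or how the IC were applied, so the df values here are assumptions for illustration only. The closed-form tail probability used is valid for even df.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def aic_implied_alpha(df):
    """Type I error of the rule 'prefer the larger model by AIC'.

    The rule is equivalent to rejecting when the LRT statistic > 2 * df.
    """
    return chi2_sf_even_df(2.0 * df, df)


if __name__ == "__main__":
    for df in (2, 4, 6, 8, 10):  # hypothetical df values
        print(df, round(aic_implied_alpha(df), 4))
```

In this sketch the implied type I error of the AIC rule exceeds 5% for small df and drops below 5% from df = 8 onward, which shows how an IC can be both more conservative and less powerful than a 5% LRT when many parameters are tested at once.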