Tijmstra Jesper, Bolsinova Maria
Methodology and Statistics, Tilburg University, Tilburg, Netherlands.
University of Amsterdam, Amsterdam, Netherlands.
Front Psychol. 2018 Jun 13;9:964. doi: 10.3389/fpsyg.2018.00964. eCollection 2018.
In many applications of high- and low-stakes ability tests, a non-negligible number of respondents may fail to reach the end of the test within the specified time limit. Because some item responses will be missing for respondents who ran out of time, this raises the question of how best to deal with these missing responses in order to obtain an optimal assessment of ability. Commonly, researchers consider three general solutions: ignore the missing responses, treat them as incorrect, or treat the responses as missing but model the missingness mechanism. This paper approaches the issue of dealing with not reached items from a measurement perspective, and considers the question of what the operationalization of ability should be in maximum performance tests that work with effective time limits. We argue that the target ability that the test attempts to measure is maximum performance when operating at the test-indicated speed, and that the test instructions should be taken to imply that respondents should operate at this target speed. The speed-ability trade-off implies that the ability measured by the test will depend on this target speed, as different speed levels will result in different levels of performance on the same set of items. Crucially, since respondents with not reached items worked at a speed level lower than this target speed, the level of ability that they were able to display on the items that they did reach is higher than the level of ability that they would have displayed if they had worked at the target speed (i.e., higher than their level on the target ability). Thus, statistical methods that attempt to obtain unbiased estimates of the ability displayed on the reached items will result in biased estimates of the target ability.
The practical implications are examined in a simulation study that contrasts different methods of dealing with not reached items; it shows that current methods result in biased estimates of target ability when a speed-ability trade-off is present. The paper concludes with a discussion of ways in which the issue can be resolved.
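The argument about the speed-ability trade-off can be illustrated with a small deterministic sketch. It is not the simulation design used in the paper: the Rasch response model, the linear form of the trade-off, the parameter values, and all names (`p_correct`, `expected_scores`, `slowdown`, `tradeoff`) are assumptions made here for illustration only. The sketch compares the expected proportion correct at the target speed with the scores obtained when not reached items are ignored or scored as incorrect.

```python
import math


def p_correct(theta, b):
    """Rasch model probability of a correct response (assumed response model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def expected_scores(target_theta, slowdown, tradeoff, difficulties, n_reached):
    """Expected proportion-correct scores under three scoring rules.

    Hypothetical linear trade-off: working `slowdown` units below the target
    speed raises the ability displayed on the reached items by
    `tradeoff * slowdown`.
    """
    effective_theta = target_theta + tradeoff * slowdown
    reached = difficulties[:n_reached]
    n_items = len(difficulties)

    # Expected score if the respondent had worked at the target speed
    # and therefore reached every item.
    target = sum(p_correct(target_theta, b) for b in difficulties) / n_items

    # Not reached items ignored: proportion correct on reached items only.
    expected_correct = sum(p_correct(effective_theta, b) for b in reached)
    ignored = expected_correct / n_reached

    # Not reached items scored as incorrect.
    incorrect = expected_correct / n_items

    return target, ignored, incorrect


# Illustrative values: 20 items of equal difficulty, 15 of them reached.
difficulties = [0.0] * 20
target, ignored, incorrect = expected_scores(
    target_theta=0.0,
    slowdown=1.0,
    tradeoff=0.5,
    difficulties=difficulties,
    n_reached=15,
)
print(f"target speed: {target:.3f}")
print(f"ignore not reached: {ignored:.3f}")
print(f"score as incorrect: {incorrect:.3f}")
```

Under these assumed values, ignoring the not reached items yields a score above the one expected at the target speed, in line with the argument above. The score obtained by treating not reached items as incorrect falls below it in this particular configuration, but its direction depends on the chosen parameters and should not be read as a general result.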