Department of Methodology and Statistics TSB, Tilburg University, PO Box 90153, 5000LE, Tilburg, The Netherlands.
Open University of The Netherlands, Heerlen, The Netherlands.
Psychometrika. 2024 Dec;89(4):1175-1185. doi: 10.1007/s11336-024-10004-7. Epub 2024 Oct 30.
In this rejoinder to McNeish (2024) and Mislevy (2024), who both responded to our focus article on the merits of the simple sum score (Sijtsma et al., 2024), we address several issues. Psychometrics education, and in particular psychometricians' outreach, may help researchers to use IRT models as a precursor for the responsible use of the latent variable score and the sum score. Different methods used for test and questionnaire construction often do not produce highly different results, and when they do, this may be due to an unarticulated attribute theory generating noisy data. The sum score and transformations thereof, such as normalized test scores and percentiles, may help test practitioners and their clients to better communicate results. Latent variables prove important in more advanced applications such as equating and adaptive testing, where they serve as technical tools rather than communication devices. Decisions based on test results are often binary or use a rather coarse ordering of scale levels, and hence do not require a high level of granularity (but nevertheless need to be precise). The gap between psychology and psychometrics is growing deeper and wider and needs to be bridged. Psychology and psychometrics must work together to attain this goal.