Paul-Christian Bürkner
Cluster of Excellence SimTech, University of Stuttgart, Stuttgart, Germany.
Psychometrika. 2022 Dec;87(4):1439-1472. doi: 10.1007/s11336-022-09843-z. Epub 2022 Feb 8.
Personality tests employing comparative judgments have been proposed as an alternative to Likert-type rating scales. One of the main advantages of a comparative format is that it can reduce faking of responses in high-stakes situations. However, previous research has shown that it is very difficult to obtain trait score estimates that are both faking-resistant and sufficiently accurate for individual-level diagnostic decisions. With the goal of contributing to a solution, I study the information obtainable from comparative judgments analyzed by means of Thurstonian IRT models. First, I extend the mathematical theory of ordinal comparative judgments and corresponding models. Second, I provide optimal test designs for Thurstonian IRT models that maximize the accuracy of people's trait score estimates from both frequentist and Bayesian statistical perspectives. Third, I derive analytic upper bounds for the accuracy of these trait estimates achievable through ordinal Thurstonian IRT models. Fourth, I perform numerical experiments that complement results obtained in earlier simulation studies. The combined analytical and numerical results suggest that it is indeed possible to design personality tests using comparative judgments that yield trait score estimates sufficiently accurate for individual-level diagnostic decisions, while reducing faking in high-stakes situations. Recommendations for the practical application of comparative judgments for the measurement of personality, specifically in high-stakes situations, are given.