Stanley Leanne M, Edwards Michael C
The Ohio State University, Columbus, OH, USA.
Educ Psychol Meas. 2016 Dec;76(6):976-985. doi: 10.1177/0013164416638900. Epub 2016 Mar 17.
The purpose of this article is to highlight the distinction between the reliability of test scores and the fit of psychometric measurement models, reminding readers why it is important to consider both when evaluating whether test scores are valid for a proposed interpretation and/or use. It is often the case that an investigator judges both the reliability of scores and the fit of a corresponding measurement model to be either acceptable or unacceptable for a given situation, but these are not the only possible outcomes. This article focuses on situations in which model fit is deemed acceptable, but reliability is not. Data were simulated based on the item characteristics of the PROMIS (Patient Reported Outcomes Measurement Information System) anxiety item bank and analyzed using methods from classical test theory, factor analysis, and item response theory. Analytic techniques from different psychometric traditions were used to illustrate that reliability and model fit are distinct, and that disagreement among indices of reliability and model fit may provide important information bearing on a particular validity argument, independent of the data analytic techniques chosen for a particular research application. We conclude by discussing the important information gleaned from the assessment of reliability and model fit.
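The central point, that a measurement model can fit well while scores remain unreliable, can be illustrated with a small simulation. The sketch below is not the authors' procedure: they simulated from PROMIS anxiety item characteristics, whereas this example assumes a simplified continuous one-factor model with hypothetical loadings (0.3 versus 0.8), five items, and a hypothetical sample size. It uses Cronbach's alpha as the reliability index and the root mean square of off-diagonal correlation residuals (an SRMR-like quantity) from an iterated principal-axis fit as a rough stand-in for model fit. All function names are illustrative.

```python
import numpy as np


def simulate_one_factor(n, loadings, rng):
    """Draw n observations from a standardized one-factor model."""
    loadings = np.asarray(loadings, dtype=float)
    factor = rng.standard_normal(n)
    # Unique standard deviations chosen so each item has unit variance.
    unique_sd = np.sqrt(1.0 - loadings**2)
    noise = rng.standard_normal((n, loadings.size)) * unique_sd
    return factor[:, None] * loadings + noise


def cronbach_alpha(x):
    """Cronbach's alpha for an (n_persons, n_items) score matrix."""
    k = x.shape[1]
    sum_item_var = x.var(axis=0, ddof=1).sum()
    total_var = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - sum_item_var / total_var)


def rms_residual_one_factor(x, n_iter=200):
    """Fit one factor by iterated principal-axis factoring and return
    the root mean square of the off-diagonal correlation residuals."""
    r = np.corrcoef(x, rowvar=False)
    k = r.shape[0]
    # Squared multiple correlations as starting communalities.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(r))
    for _ in range(n_iter):
        reduced = r.copy()
        np.fill_diagonal(reduced, h2)
        vals, vecs = np.linalg.eigh(reduced)  # eigenvalues in ascending order
        lam = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
        h2 = lam**2
    implied = np.outer(lam, lam)
    off_diag = ~np.eye(k, dtype=bool)
    return float(np.sqrt(np.mean((r - implied)[off_diag] ** 2)))


rng = np.random.default_rng(2016)  # arbitrary seed
n_persons, n_items = 5000, 5       # hypothetical design

weak = simulate_one_factor(n_persons, [0.3] * n_items, rng)
strong = simulate_one_factor(n_persons, [0.8] * n_items, rng)

for label, data in (("weak loadings", weak), ("strong loadings", strong)):
    print(f"{label}: alpha = {cronbach_alpha(data):.2f}, "
          f"RMS residual = {rms_residual_one_factor(data):.3f}")
```

In both conditions the one-factor model is the true generating model, so residuals should be small and reflect only sampling error. Reliability differs sharply, however: with five equally loading items, the population alpha is roughly 0.33 for loadings of 0.3 and roughly 0.90 for loadings of 0.8. Under these assumptions, the weak-loading condition is an instance of the case the article focuses on, where fit looks acceptable but reliability does not.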