Thomas Michael L, Brown Gregory G, Gur Ruben C, Moore Tyler M, Patt Virginie M, Risbrough Victoria B, Baker Dewleen G
a Department of Psychiatry, University of California San Diego, La Jolla, CA, USA.
b VA Center of Excellence for Stress and Mental Health (CESAMH), San Diego, CA, USA.
J Clin Exp Neuropsychol. 2018 Oct;40(8):745-760. doi: 10.1080/13803395.2018.1427699. Epub 2018 Feb 5.
Models from signal detection theory are commonly used to score neuropsychological test data, especially tests of recognition memory. Here we show that certain item response theory models can be formulated as signal detection theory models, thus linking two complementary but distinct methodologies. We then use the approach to evaluate the validity (construct representation) of commonly used research measures, demonstrate the impact of conditional error on neuropsychological outcomes, and evaluate measurement bias.
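The link between the two frameworks can be illustrated with a minimal sketch. It assumes an equal-variance signal detection model written in probit form, with a person-level discrimination parameter (`d_prime`), a person-level response criterion (`criterion`), and an item difficulty (`item_difficulty`) that reduces effective discriminability. The function name and this particular parameterization are illustrative assumptions, not necessarily the model specification used in the study.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def p_old(d_prime, criterion, is_target, item_difficulty=0.0):
    """Probability of responding "old" to one item.

    Hypothetical parameterization (an assumption for illustration):
      target: Phi( (d_prime - item_difficulty) / 2 - criterion)
      foil:   Phi(-(d_prime - item_difficulty) / 2 - criterion)
    With item_difficulty = 0 this is the textbook equal-variance SDT model.
    """
    sign = 1.0 if is_target else -1.0
    return STD_NORMAL.cdf(sign * (d_prime - item_difficulty) / 2.0 - criterion)


def p_correct(d_prime, criterion, is_target, item_difficulty=0.0):
    """Probability of a correct response: "old" for targets, "new" for foils."""
    p = p_old(d_prime, criterion, is_target, item_difficulty)
    return p if is_target else 1.0 - p


# Hit and false-alarm rates for one simulated examinee (made-up values).
hit_rate = p_old(1.5, 0.2, is_target=True)
false_alarm_rate = p_old(1.5, 0.2, is_target=False)

# The classical SDT estimates recover the generating parameters.
z_hit = STD_NORMAL.inv_cdf(hit_rate)
z_fa = STD_NORMAL.inv_cdf(false_alarm_rate)
recovered_d_prime = z_hit - z_fa
recovered_criterion = -(z_hit + z_fa) / 2.0
```

Under this parameterization the probability of a correct response is a normal-ogive function of `d_prime - item_difficulty`, shifted by the criterion. That is the general shape of a probit item response model in which discrimination plays the role of examinee ability.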
Signal detection-item response theory (SD-IRT) models were fitted to recognition memory data for words, faces, and objects. The sample consisted of U.S. Infantry Marines and Navy Corpsmen participating in the Marine Resiliency Study. Data comprised item responses to the Penn Face Memory Test (PFMT; N = 1,338), Penn Word Memory Test (PWMT; N = 1,331), and Visual Object Learning Test (VOLT; N = 1,249), and self-report of past head injury with loss of consciousness.
SD-IRT models adequately fitted recognition memory item data across all modalities. Error varied systematically with ability estimates, and distributions of residuals from the regression of memory discrimination onto self-report of past head injury were positively skewed towards regions of larger measurement error. Analyses of differential item functioning revealed little evidence of systematic bias by level of education.
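Why error can vary with ability can be sketched with the standard item information formula for a normal-ogive item response model. The item difficulties and slope below are made-up values chosen for illustration, not estimates from the study, and `conditional_se` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def conditional_se(ability, item_difficulties, slope=1.0):
    """Approximate standard error of an ability estimate at a given ability.

    Item information for a normal-ogive item with P = Phi(slope * (ability - b)):
        I(ability) = (slope * phi(z))**2 / (P * (1 - P)),  z = slope * (ability - b)
    The conditional standard error is 1 / sqrt(sum of item information).
    """
    information = 0.0
    for b in item_difficulties:
        z = slope * (ability - b)
        p = STD_NORMAL.cdf(z)
        information += (slope * STD_NORMAL.pdf(z)) ** 2 / (p * (1.0 - p))
    return 1.0 / sqrt(information)


# Hypothetical 30-item test whose difficulties cluster in a narrow range.
difficulties = [-0.5, 0.0, 0.5] * 10

for theta in (-3.0, 0.0, 3.0):
    print(theta, round(conditional_se(theta, difficulties), 3))
```

When item difficulties cluster in a narrow range, information peaks near that range and the standard error grows for examinees far above or below it. This is one way that item parameters not optimized for the sample can produce larger measurement error in some regions of the ability scale.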
SD-IRT models benefit from the measurement rigor of item response theory, which permits the modeling of item difficulty and examinee ability, and from signal detection theory, which provides an interpretive framework encompassing the experimentally validated constructs of memory discrimination and response bias. We used this approach to validate the construct representation of commonly used research measures and to demonstrate how nonoptimized item parameters can lead to erroneous conclusions when interpreting neuropsychological test data. Future work might include the development of computerized adaptive tests and integration with mixture and random-effects models.