Model choice and sample size in item response theory analysis of aphasia tests.

Affiliation

VA Pittsburgh Healthcare System and University of Pittsburgh, PA, USA.

Publication information

Am J Speech Lang Pathol. 2012 May;21(2):S38-50. doi: 10.1044/1058-0360(2011/11-0090). Epub 2012 Jan 9.

DOI:10.1044/1058-0360(2011/11-0090)

PMID:22230175

Abstract

PURPOSE

The purpose of this study was to identify the most appropriate item response theory (IRT) measurement model for aphasia tests requiring 2-choice responses and to determine whether small samples are adequate for estimating such models.

METHOD

Pyramids and Palm Trees (Howard & Patterson, 1992) test data that had been collected from individuals with aphasia were analyzed, and the resulting item and person estimates were used to develop simulated test data for 3 sample size conditions. The simulated data were analyzed using a standard 1-parameter logistic (1-PL) model and 3 models that accounted for the influence of guessing: augmented 1-PL and 2-PL models and a 3-PL model. The model estimates obtained from the simulated data were compared to their known true values.
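The model family named above can be sketched in a few lines. This is a minimal illustration, not the study's code: it assumes the "augmented" models fix the lower asymptote at 0.5 (the chance rate for a 2-choice item), which the abstract does not state explicitly. The function names, the seed, and the example abilities and difficulties are all hypothetical.

```python
import math
import random


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: person ability, b: item difficulty,
    a: discrimination (1.0 gives the 1-PL form),
    c: lower asymptote for guessing (0.0 gives the standard model;
       0.5 is an ASSUMED chance rate for 2-choice items).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, difficulties, a=1.0, c=0.5, seed=0):
    """Simulate a persons-by-items matrix of 0/1 responses."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p_correct(t, b, a, c) else 0 for b in difficulties]
        for t in thetas
    ]


if __name__ == "__main__":
    # Hypothetical values: 5 persons and 4 fairly easy items.
    rng = random.Random(1)
    thetas = [rng.gauss(0.0, 1.0) for _ in range(5)]
    difficulties = [-2.0, -1.5, -1.0, -0.5]
    for row in simulate_responses(thetas, difficulties):
        print(row)
```

Setting `c=0.0` gives the standard 1-PL model, and leaving `a` free to vary by item gives the 2-PL and 3-PL forms. Comparing estimates from such simulated data against the generating `thetas` and `difficulties` is the general recovery design the paragraph describes.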

RESULTS

With small and medium sample sizes, an augmented 1-PL model was the most accurate at recovering the known item and person parameters; however, no model performed well at any sample size. Follow-up simulations confirmed that the large influence of guessing and the extreme easiness of the items contributed substantially to the poor estimation of item difficulty and person ability.
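One way to see why guessing and easy items could hurt estimation is the standard item information function for a logistic model with a lower asymptote. The sketch below is an illustration under assumptions, not an analysis from the study: the guessing value of 0.5 and the example difficulty of -3 are hypothetical choices for a very easy 2-choice item.

```python
import math


def p_correct(theta, b, a=1.0, c=0.0):
    """Logistic IRT response probability with lower asymptote c."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, b, a=1.0, c=0.0):
    """Fisher information of one item at ability theta.

    Standard 3-PL formula: a^2 * (Q / P) * ((P - c) / (1 - c))^2,
    which reduces to a^2 * P * Q when c = 0.
    """
    p = p_correct(theta, b, a, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2


if __name__ == "__main__":
    # Hypothetical comparison at theta = 0.
    print("no guessing, b = 0:  ", item_information(0.0, 0.0, c=0.0))
    print("guessing 0.5, b = 0: ", item_information(0.0, 0.0, c=0.5))
    print("guessing 0.5, b = -3:", item_information(0.0, -3.0, c=0.5))
```

With `c = 0.5`, an item matched to the person's ability carries one third of the information it would carry without guessing (1/12 versus 1/4), and an item far below the person's ability carries less still. Less information per item means larger standard errors for both ability and difficulty estimates, which is in line with the poor recovery reported above.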

CONCLUSION

Incorporating the assumption of guessing into IRT models improves parameter estimation accuracy, even for small samples. However, caution should be exercised in interpreting scores obtained from easy 2-choice tests, regardless of whether IRT modeling or percentage correct scoring is used.
