College of Education, Inha University, Incheon, Korea.
J Educ Eval Health Prof. 2024;21:23. doi: 10.3352/jeehp.2024.21.23. Epub 2024 Sep 12.
Computerized adaptive testing (CAT) has become a widely adopted test design for high-stakes licensing and certification exams, particularly in the health professions in the United States, because it tailors test difficulty in real time, reducing testing time while providing precise ability estimates. A key component of CAT is item response theory (IRT), which facilitates the dynamic selection of items based on examinees' ability levels during a test. Accurate estimation of item and ability parameters is essential for successful CAT implementation, which in turn requires convenient and reliable software. This paper introduces the irtQ R package (http://CRAN.R-project.org/), which simplifies IRT-based analysis and item calibration under unidimensional IRT models. While it does not directly simulate CAT, it provides essential tools to support CAT development, including parameter estimation using marginal maximum likelihood estimation via the expectation-maximization algorithm, pretest item calibration through fixed item parameter calibration and fixed ability parameter calibration methods, and examinee ability estimation. The package also enables users to compute the item and test characteristic curves and information functions needed to evaluate the psychometric properties of a test. This paper illustrates the key features of the irtQ package through examples using simulated datasets, demonstrating its utility in IRT applications such as test data analysis and ability scoring. By providing a user-friendly environment for IRT analysis, irtQ significantly enhances the capacity for efficient adaptive testing research and operations. Finally, the paper highlights additional core functionalities of irtQ, emphasizing its broader applicability to the development and operation of IRT-based assessments.