Bjorner Jakob Bue, Chang Chih-Hung, Thissen David, Reeve Bryce B
QualityMetric Incorporated, 640 George Washington Highway, Suite 201, Lincoln, RI 02865, USA.
Qual Life Res. 2007;16 Suppl 1:95-108. doi: 10.1007/s11136-007-9168-6. Epub 2007 Feb 15.
Item banks and computerized adaptive testing (CAT) have the potential to greatly improve the assessment of health outcomes. This review describes the unique features of item banks and CAT and discusses how to develop item banks. In CAT, a computer selects from an item bank the items that are most relevant for and informative about the particular respondent, thus optimizing test relevance and precision. Item response theory (IRT) provides the foundation for selecting the items that are most informative for the particular respondent and for scoring responses on a common metric. The development of an item bank is a multi-stage process that requires a clear definition of the construct to be measured, good items, a careful psychometric analysis of the items, and a clear specification of the final CAT. The psychometric analysis needs to evaluate whether the assumptions of the IRT model, such as unidimensionality and local independence, hold; whether the items function the same way in different subgroups of the population; and whether there is an adequate fit between the data and the chosen item response models. In addition, interpretation guidelines need to be established to support the clinical application of the assessment. Although medical research can draw upon expertise from educational testing in the development of item banks and CAT, the medical field also encounters unique opportunities and challenges.
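The item-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a dichotomous two-parameter logistic (2PL) IRT model and a maximum-information selection rule. Neither is specified in the abstract, and health-outcome item banks often use polytomous models instead. The item bank, its parameter values, and all function names are hypothetical.

```python
import math

# Hypothetical item bank: item id -> (discrimination a, difficulty b).
# The values are illustrative only, not taken from any real instrument.
ITEM_BANK = {
    "easy": (1.2, -1.5),
    "medium": (1.5, 0.0),
    "hard": (1.0, 1.8),
}


def prob_2pl(theta, a, b):
    """Probability of endorsing an item under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the unadministered item that is most informative at theta."""
    candidates = [item for item in bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, *bank[item]))


if __name__ == "__main__":
    # With a provisional trait estimate of 0.0 and no items given yet,
    # the item whose difficulty lies closest to the estimate tends to win.
    print(select_next_item(0.0, ITEM_BANK, set()))
```

In a full CAT, the trait estimate would be updated after each response and the selection repeated until a stopping rule, such as a target standard error, is met. Those steps are omitted here.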