Thellesen Line, Bergholt Thomas, Hedegaard Morten, Colov Nina Palmgren, Christensen Karl Bang, Andersen Kristine Sylvan, Sorensen Jette Led
Department of Obstetrics, The Juliane Marie Centre for Children, Women and Reproduction, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, DK-2100, Copenhagen, Denmark.
Section of Biostatistics, Department of Public Health, University of Copenhagen, Oester Farimagsgade 5, Building 15.2.12, DK-1014, Copenhagen, Denmark.
BMC Med Educ. 2017 May 18;17(1):88. doi: 10.1186/s12909-017-0915-2.
To reduce the incidence of hypoxic brain injuries among newborns, a national cardiotocography (CTG) education program was implemented in Denmark. A multiple-choice question test was integrated into the program. The aim of this article was to describe and discuss the test development process and to introduce a feasible method for written test development in general.
The test development was based on the unitary approach to validity. The process involved national consensus on learning objectives, standardized item writing, pilot testing, sensitivity analyses, standard setting and evaluation of psychometric properties using Item Response Theory models. Test responses and feedback from midwives, specialists and residents in obstetrics and gynecology, and medical and midwifery students were used in the process (proofreaders n = 6, pilot test participants n = 118, CTG course participants n = 1679).
The final test included 30 items, and the passing score was set at 25 correct answers. All items fitted a loglinear Rasch model, and the test was able to discriminate between levels of competence. Seven items revealed differential item functioning in relation to profession and geographical region, which means the test is not suitable for measuring differences between midwives and physicians or differences across regions. Cronbach's alpha was 0.79 in the pilot test but 0.63 in the CTG education program. This indicates a need for more items, and for items with a higher degree of difficulty, and illustrates the importance of context when discussing validity.
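For readers unfamiliar with the reliability coefficient reported above, the following is a minimal Python sketch of how Cronbach's alpha can be computed from a matrix of dichotomous item scores. It is not the authors' analysis code; the function name `cronbach_alpha` and the toy response matrix are assumptions made purely for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of examinee rows of item scores (e.g. 0/1).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two examinees")
    # Sample variance of each item across examinees.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each examinee's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 examinees answering 3 items (1 = correct).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy)
```

Alpha depends on the spread of total scores in the group being tested, so a more homogeneous group can yield a lower value with the same items. This is consistent with the coefficient differing between the pilot and course settings.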
Test development is a complex and time-consuming process. The unitary approach to validity was a useful and applicable tool for developing a written CTG assessment. The process and findings supported our proposed interpretation of the assessment as measuring CTG knowledge and interpretive skills. However, for the test to function as a high-stakes assessment, higher reliability is required.
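As a rough illustration of what "higher reliability" could imply for test length, the Spearman-Brown prophecy formula can be sketched in Python as follows. This is not part of the study: the target reliability of 0.80 is an assumed value chosen only for the example, the function names are hypothetical, and the formula assumes any added items behave like the existing ones, so it does not capture the effect of adding more difficult items.

```python
import math


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_for(reliability, target):
    """Lengthening factor needed to move from reliability to target."""
    return target * (1 - reliability) / (reliability * (1 - target))


def items_needed(n_items, reliability, target):
    """Total number of comparable items needed to reach the target reliability."""
    return math.ceil(n_items * length_factor_for(reliability, target))


# 30 items and alpha = 0.63 are taken from the abstract;
# the target of 0.80 is an assumption for illustration only.
needed = items_needed(30, 0.63, 0.80)
```

Under these assumptions the sketch gives 71 items in total, which shows why simply lengthening the test may be less practical than also revising item difficulty.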