Jiang Zhehan, Shi Dexin, Distefano Christine
Peking University, Beijing, China.
University of South Carolina, Columbia, SC, USA.
Educ Psychol Meas. 2021 Dec;81(6):1221-1233. doi: 10.1177/0013164421992112. Epub 2021 Feb 8.
The costs of an objective structured clinical examination (OSCE) are of concern to health profession educators globally. As OSCEs are usually designed under the generalizability theory (G-theory) framework, this article proposes a machine-learning-based approach to optimize the costs while maintaining the minimum required generalizability coefficient, a reliability-like index in G-theory. The authors adopted G-theory parameters obtained from an OSCE hosted by a medical school, reproduced the generalizability coefficients as a basis for the optimization, applied a simulated annealing algorithm to find the number of levels of each facet that minimizes the associated costs, and conducted the analysis under various conditions via computer simulation. For a given generalizability coefficient, the proposed approach, essentially a decision-support tool, found the optimal solution for the OSCE such that the associated costs were minimized. The computer simulation results showed how the cost reductions varied with different levels of the required generalizability coefficient. Machine-learning-based approaches can be used in conjunction with psychometric modeling to help plan assessment tasks more scientifically. The proposed approach is easy to adopt in practice and to customize for specific testing designs. While these results are encouraging, possible pitfalls, such as failure of the algorithm to converge and inadequate cost assumptions, should be avoided.
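The optimization described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a fully crossed persons x stations x raters design, and the variance components, unit costs, upper bounds on facet levels, and cooling schedule are all hypothetical values chosen only for illustration. Under that assumed design the generalizability coefficient is sigma2_p / (sigma2_p + sigma2_ps/n_s + sigma2_pr/n_r + sigma2_psr,e/(n_s*n_r)), and the search looks for the cheapest (n_s, n_r) that keeps this coefficient at or above a target.

```python
import math
import random

# Hypothetical variance components (not taken from the article) for an
# assumed persons x stations x raters crossed design.
VAR = {"p": 0.50, "ps": 0.30, "pr": 0.05, "psr_e": 0.40}
# Hypothetical unit costs: one station, and one rater at one station.
COST_PER_STATION = 100.0
COST_PER_RATER = 40.0
# Hypothetical upper bounds on the number of levels of each facet.
MAX_LEVELS = {"s": 30, "r": 5}


def g_coefficient(n_s, n_r):
    """Generalizability coefficient for n_s stations and n_r raters."""
    rel_error = (VAR["ps"] / n_s + VAR["pr"] / n_r
                 + VAR["psr_e"] / (n_s * n_r))
    return VAR["p"] / (VAR["p"] + rel_error)


def cost(n_s, n_r):
    """Assumed linear cost: stations plus raters at every station."""
    return COST_PER_STATION * n_s + COST_PER_RATER * n_s * n_r


def anneal(target_g, steps=5000, t_start=500.0, t_end=0.1, seed=0):
    """Simulated annealing over (n_s, n_r) subject to G >= target_g."""
    rng = random.Random(seed)
    # Start from the largest design, which has the highest G in this model.
    current = (MAX_LEVELS["s"], MAX_LEVELS["r"])
    if g_coefficient(*current) < target_g:
        raise ValueError("target G is not reachable within MAX_LEVELS")
    best = current
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Propose a neighbour: change one facet by one level.
        n_s, n_r = current
        if rng.random() < 0.5:
            n_s += rng.choice((-1, 1))
        else:
            n_r += rng.choice((-1, 1))
        # Reject proposals that leave the bounds or violate the G constraint.
        if not (1 <= n_s <= MAX_LEVELS["s"] and 1 <= n_r <= MAX_LEVELS["r"]):
            continue
        if g_coefficient(n_s, n_r) < target_g:
            continue
        # Metropolis rule: always accept cheaper designs, sometimes accept
        # more expensive ones while the temperature is still high.
        delta = cost(n_s, n_r) - cost(*current)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = (n_s, n_r)
            if cost(*current) < cost(*best):
                best = current
    return best
```

Calling `anneal(0.80)` returns a `(n_s, n_r)` pair that satisfies the constraint, and `cost(*pair)` gives its cost under the assumed cost function. Because simulated annealing is stochastic, the result is not guaranteed to be the global optimum; for a search space as small as this one it can be checked against an exhaustive enumeration, which is one way to guard against the convergence failures the abstract warns about.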