Hofman Abe D, Brinkhuis Matthieu J S, Bolsinova Maria, Klaiber Jonathan, Maris Gunter, van der Maas Han L J
Department of Psychological Methods, University of Amsterdam, 1018 WS Amsterdam, The Netherlands.
Oefenweb, 1011 VL Amsterdam, The Netherlands.
J Intell. 2020 Mar 3;8(1):10. doi: 10.3390/jintelligence8010010.
One of the highest ambitions in educational technology is the move towards personalized learning. To this end, computerized adaptive learning (CAL) systems are developed. A popular method for tracking the development of student ability and item difficulty in CAL systems is the Elo Rating System (ERS). The ERS allows for dynamic model parameters by updating the ability and difficulty estimates after every response. However, the ERS has two drawbacks: it does not provide standard errors, and it results in rating variance inflation. We identify three statistical issues responsible for both drawbacks. To solve these issues, we introduce a new tracking system based on urns, in which every person and item is represented by an urn filled with a combination of green and red marbles. Urns are updated by an exchange of marbles after each response, such that the proportions of green marbles represent estimates of person ability or item difficulty. A main advantage of this approach is that the standard errors are known; hence, the method allows for statistical inference, such as testing for learning effects. We highlight features of the Urnings algorithm and compare it to the popular ERS in a simulation study and in an empirical data example from a large-scale CAL application.
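To make the ERS update concrete, here is a minimal Python sketch of the textbook Elo update with a logistic expected score. It is not taken from the paper: the function names, the step size `k`, and its default value are illustrative assumptions. The sketch covers only the ERS, not the Urnings algorithm, whose marble-exchange rule the abstract does not specify.

```python
import math


def expected_score(theta: float, beta: float) -> float:
    """Probability of a correct response under a logistic (Rasch-type) model.

    theta: current person ability rating; beta: current item difficulty rating.
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


def elo_update(theta: float, beta: float, correct: int, k: float = 0.4):
    """One Elo-style update after a single response.

    correct is 1 for a correct response and 0 for an incorrect one.
    k is a hypothetical step size chosen for illustration only.
    Returns the updated (theta, beta) pair.
    """
    surprise = correct - expected_score(theta, beta)
    # The person moves towards the observed outcome; the item moves the
    # opposite way, so the sum of the two ratings is left unchanged.
    return theta + k * surprise, beta - k * surprise


if __name__ == "__main__":
    theta, beta = 0.0, 0.0
    for outcome in (1, 1, 0, 1):
        theta, beta = elo_update(theta, beta, outcome)
    print(theta, beta)
```

Because each update is a fixed-size step driven by a single response, the ratings keep fluctuating even when true ability is constant, and the update itself yields no measure of uncertainty. This is the gap the urn-based approach is meant to close.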