Tan Chunxi, Han Ruijian, Ye Rougang, Chen Kani
The Hong Kong University of Science and Technology, Kowloon, Hong Kong.
Appl Psychol Meas. 2020 Jun;44(4):251-266. doi: 10.1177/0146621619858674. Epub 2019 Jul 25.
Personalized recommendation systems that adapt to each learner's own pace have been widely adopted in e-learning. Drawing on learning behavior data, psychometric assessment models track the learner's proficiency on knowledge points, and a well-designed recommendation strategy then selects a sequence of actions to maximize the learner's learning efficiency. This article proposes a novel adaptive recommendation strategy under the framework of reinforcement learning. The proposed strategy is realized by deep Q-learning algorithms, which belong to the family of deep reinforcement learning techniques that contributed to AlphaGo Zero reaching superhuman play in the game of Go. The proposed algorithm incorporates an early stopping mechanism to account for the possibility that learners may choose to stop learning. It can properly handle missing data and can accommodate more individual-specific features for better recommendations. The recommendation strategy guides individual learners along efficient learning paths that vary from person to person. The authors present concrete examples with numerical analysis of substantive learning scenarios to further demonstrate the power of the proposed method.
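To make the idea of a reinforcement-learning recommender with early stopping concrete, the following is a minimal sketch and not the authors' implementation. It swaps the deep Q-network for a plain tabular Q-learning update, and every detail of the toy learner is an assumption made for illustration: two knowledge points with point 1 depending on point 0, the mastery probabilities, the dropout probability `P_STOP`, and the names `step`, `train`, and `recommend`. Early stopping is modeled by ending the episode when the simulated learner quits, so no future value is credited beyond that step.

```python
import random

N_POINTS = 2      # hypothetical knowledge points; point 1 builds on point 0
P_STOP = 0.2      # assumed chance that the learner quits after any step
ALPHA, GAMMA, EPS = 0.02, 0.8, 0.2  # step size, discount, exploration rate


def step(state, action, rng):
    """Simulate one recommended item; return (next_state, reward, done)."""
    mastery = list(state)
    reward = 0.0
    if not mastery[action]:
        # Practice works well only if the prerequisite is already mastered.
        ready = action == 0 or mastery[0] == 1
        if rng.random() < (0.8 if ready else 0.1):
            mastery[action] = 1
            reward = 1.0  # one unit of reward per newly mastered point
    # The episode ends on full mastery or when the learner stops early.
    done = all(mastery) or rng.random() < P_STOP
    return tuple(mastery), reward, done


def train(episodes=20000, max_steps=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = (0,) * N_POINTS
        for _ in range(max_steps):
            values = q.setdefault(state, [0.0] * N_POINTS)
            if rng.random() < EPS:
                action = rng.randrange(N_POINTS)
            else:
                action = max(range(N_POINTS), key=values.__getitem__)
            nxt, reward, done = step(state, action, rng)
            # No bootstrapping past a terminal step, including an early stop.
            future = 0.0 if done else max(q.setdefault(nxt, [0.0] * N_POINTS))
            values[action] += ALPHA * (reward + GAMMA * future - values[action])
            if done:
                break
            state = nxt
    return q


def recommend(q, state):
    """Greedy recommendation: the action with the highest learned value."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


q_table = train()
print(recommend(q_table, (0, 0)), recommend(q_table, (1, 0)))
```

Under these toy settings the learned policy should favor the prerequisite point for a fresh learner and the dependent point once the prerequisite is mastered, because wasted steps cost more when the learner may quit at any time. A deep Q-learning variant would replace the table `q` with a neural network that maps learner features to action values.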