Gao Ailian, Liu Zenglei
School of Electrical and Information Engineering, Hunan Institute of Technology, Hengyang, Hunan, China.
PLoS One. 2025 Sep 9;20(9):e0330433. doi: 10.1371/journal.pone.0330433. eCollection 2025.
Knowledge tracing can reveal students' level of knowledge in relation to their learning performance. Recently, many machine learning algorithms have been proposed for knowledge tracing and have achieved promising results. However, most previous approaches cannot cope with long-sequence time-series prediction, which is more valuable than the short-sequence prediction used extensively in current knowledge-tracing studies. In this study, we propose a long-sequence time-series forecasting pipeline for knowledge tracing that leverages both time stamps and exercise sequences. First, we introduce a bidirectional LSTM model to produce embeddings of exercise-answering records. Second, we combine each student's exercise record and its time stamp into a single vector per record. Next, the resulting sequence of vectors is taken as input to the proposed Informer model, which uses the probability-sparse self-attention mechanism. Note that the probability-sparse self-attention module addresses the quadratic computational complexity of the canonical encoder-decoder architecture. Finally, we integrate temporal information and individual knowledge states to predict the answers to a sequence of target exercises. To evaluate the proposed LSTKT model, we conducted comparison experiments against state-of-the-art knowledge tracing algorithms on publicly available datasets. The model shows quantitative improvements over existing models. On the Assistments2009 dataset, it achieved an accuracy of 78.49% and an AUC of 78.81%. On the Assistments2017 dataset, it reached an accuracy of 74.22% and an AUC of 72.82%. On the EdNet dataset, it attained an accuracy of 68.17% and an AUC of 70.78%.
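For readers unfamiliar with the probability-sparse self-attention step mentioned in the abstract, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes a single attention head, no masking, and one shared random subset of keys for all queries (a simplification). The function name `probsparse_attention` and the `factor` and `seed` parameters are hypothetical names chosen for illustration. Each query is scored against a small sample of keys, only the queries with the largest "max minus mean" score receive full softmax attention, and the remaining queries fall back to the mean of the values.

```python
import numpy as np


def probsparse_attention(Q, K, V, factor=5, seed=0):
    """Simplified single-head probability-sparse self-attention (illustrative only)."""
    rng = np.random.default_rng(seed)
    L_q, d = Q.shape
    L_k = K.shape[0]

    # Number of sampled keys and of "active" queries, both about factor * ln(L).
    n_sample = max(1, min(L_k, int(np.ceil(factor * np.log(L_k)))))
    n_top = max(1, min(L_q, int(np.ceil(factor * np.log(L_q)))))

    # Score every query against a random subset of keys only.
    # Assumption: one shared subset for all queries, to keep the sketch short.
    idx = rng.choice(L_k, size=n_sample, replace=False)
    sampled = Q @ K[idx].T / np.sqrt(d)                # shape (L_q, n_sample)

    # Sparsity measure per query: max minus mean of the sampled scores.
    sparsity = sampled.max(axis=1) - sampled.mean(axis=1)
    top = np.argsort(sparsity)[-n_top:]                # indices of active queries

    # "Lazy" queries default to the mean of the values.
    out = np.tile(V.mean(axis=0), (L_q, 1))

    # Active queries get full softmax attention over all keys.
    scores = Q[top] @ K.T / np.sqrt(d)                 # shape (n_top, L_k)
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    out[top] = weights @ V
    return out, top


# Usage with random data: 64 records, each embedded as an 8-dimensional vector.
rng = np.random.default_rng(1)
Q = rng.normal(size=(64, 8))
K = rng.normal(size=(64, 8))
V = rng.normal(size=(64, 8))
out, top = probsparse_attention(Q, K, V)
```

Because full attention scores are computed for only about `factor * ln(L)` queries rather than all `L`, the cost of this step grows roughly as `L ln L` instead of `L^2`, which is the complexity reduction the abstract refers to.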