Zhou Jing, Ye Zhanliang, Zhang Sheng, Geng Zhao, Han Ning, Yang Tao
Collaborative Innovation Center of Assessment Towards Basic Education Quality, Beijing Normal University, No. 19, XinJieKouWai St., HaiDian District, Beijing, 100875, PR China.
Heliyon. 2024 Aug 10;10(16):e35945. doi: 10.1016/j.heliyon.2024.e35945. eCollection 2024 Aug 30.
Process data from computer-based problem-solving assessments is rich in valuable implicit information. However, its diverse and irregular structure makes effective feature extraction difficult, and existing methods lose information to varying degrees. Process-response behavior resembles textual data in its key units and contextual relationships. Although relevant research is scarce, text analysis methods therefore merit exploration for feature recognition in process data. This study investigated the efficacy of Term Frequency-Inverse Document Frequency (TF-IDF) and Word2vec in extracting response behavior features and compared the predictive, analytical, and clustering effects of classical supervised and unsupervised machine learning methods on response behavior. An analysis of the PISA 2012 computer-based problem-solving dataset revealed that TF-IDF effectively extracted key response behaviors, whereas Word2vec captured effective features from sequenced response behaviors. In supervised machine learning using both methods, the random forest model based on TF-IDF performed best, followed by the support vector machine (SVM) model based on Word2vec. Word2vec-based models outperformed TF-IDF-based ones in F1-score, accuracy, and recall, but not in precision, across the logistic regression, k-nearest neighbor, and SVM algorithms. In unsupervised machine learning, the k-means algorithm effectively clustered the different response behavior patterns extracted by these methods. The findings underscore the theoretical and methodological transferability of these text analysis methods to educational and psychological assessment contexts. The study informs research and practice in similar domains by yielding rich feature representations, supplementing fine-grained assessment evidence, fostering personalized learning, and opening new perspectives for educational assessment.
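To make the text analogy concrete, the following is a minimal sketch of TF-IDF applied to process data, assuming each respondent's action log is treated as a "document" and each logged action as a "term". The action names, the `tfidf` helper, and the unsmoothed weighting (relative term frequency times `log(N / df)`) are illustrative assumptions, not the authors' implementation or the actual PISA 2012 log format.

```python
import math
from collections import Counter


def tfidf(sequences):
    """Return one {action: weight} dict per action sequence.

    Assumed weighting: tf = count / sequence length,
    idf = log(number of sequences / sequences containing the action).
    """
    n_docs = len(sequences)

    # Document frequency: in how many sequences does each action occur?
    df = Counter()
    for seq in sequences:
        df.update(set(seq))

    vectors = []
    for seq in sequences:
        counts = Counter(seq)
        vec = {}
        for action, count in counts.items():
            tf = count / len(seq)
            idf = math.log(n_docs / df[action])
            vec[action] = tf * idf
        vectors.append(vec)
    return vectors


# Hypothetical action logs for three respondents (names are made up).
logs = [
    ["start", "click_map", "click_map", "submit"],
    ["start", "reset", "click_map", "submit"],
    ["start", "submit"],
]

weights = tfidf(logs)
for respondent, vec in enumerate(weights):
    print(respondent, vec)
```

Under this weighting, actions that every respondent performs (here `start` and `submit`) receive a weight of zero, while rarer actions such as `reset` stand out, which is the sense in which TF-IDF highlights key response behaviors. The resulting vectors could then be passed to a classifier or a clustering algorithm; Word2vec, by contrast, would learn dense embeddings from the order of actions and is not shown here.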