Zhang Xiaoyi, Zhang Yakang, Chen Angelina Lilac, Yu Manning, Zhang Lihao
College of Liberal Arts and Science, University of Illinois Urbana-Champaign, Urbana, IL, United States of America.
Industrial Engineering and Operations Research Department, Columbia University, New York, NY, United States of America.
PLoS One. 2025 Jan 22;20(1):e0314823. doi: 10.1371/journal.pone.0314823. eCollection 2025.
As education increasingly relies on data-driven methods, accurately predicting student performance is essential for timely and effective interventions. The California Student Performance Dataset offers a distinctive basis for analyzing the complex factors that affect educational outcomes, such as student demographics, academic behaviours, and emotional health. This study presents the GNN-Transformer-InceptionNet (GNN-TINet) model to overcome the limitations of prior models, which fail to capture intricate interactions in multi-label settings where a student may fall into several performance categories at once. GNN-TINet combines InceptionNet, transformer architectures, and graph neural networks (GNN) to improve precision in multi-label student performance prediction. Advanced preprocessing approaches, such as Contextual Frequency Encoding (CFI) and Contextual Adaptive Imputation (CAI), were applied to a dataset of 97,000 records. The model achieved a Predictive Consistency Score (PCS) of 0.92 and an accuracy of 98.5%, exceeding current standards. Exploratory data analysis revealed significant relationships between GPA, homework completion, and parental involvement, underscoring the multifaceted nature of academic achievement. The results illustrate GNN-TINet's potential to identify at-risk students, providing a robust resource for educators and policymakers seeking to improve learning outcomes. This study advances educational data mining by enabling targeted interventions that promote educational equality, addressing significant challenges in the domain.