Alazmi Meshari, Ayub Nasir
College of Computer Science and Engineering, University of Ha'il, Ha'il, Saudi Arabia.
Department of Creative Technologies, Air University Islamabad, Islamabad, Pakistan.
PLoS One. 2025 Jun 30;20(6):e0326966. doi: 10.1371/journal.pone.0326966. eCollection 2025.
Predicting student performance is crucial for providing personalized support and improving academic outcomes. As educational data grows, advanced machine-learning approaches are increasingly used to understand the variables that drive student performance. This study uses a large dataset from several Chinese institutions and high schools to develop a reliable student performance prediction technique. The dataset includes 80 features and 200,000 records, making it one of the most extensive data collections available for educational research. The data is first preprocessed to address outliers and missing values. We then developed a novel hybrid feature selection model that combines correlation filtering with mutual information, Recursive Feature Elimination (RFE) with Cross-Validation (CV), and stability selection to identify the most impactful features. We further propose EffiXNet, a refined version of EfficientNet augmented with self-attention mechanisms, dynamic convolutions, improved normalization methods, and the Sparrow Search Optimization Algorithm for hyperparameter optimization. The model was evaluated using an 80/20 train-test split, with 160,000 records used for training and 40,000 for testing. The reported results, including accuracy, precision, recall, and F1-score, are based on the full test dataset; for better visualization, however, the confusion matrices display only a representative subset of test results. EffiXNet achieved an AUC of 0.99, a 25% reduction in logarithmic loss relative to the baseline models, a precision of 97.8%, an F1-score of 98.1%, and reliable optimization of memory usage. The model performed consistently well across these metrics, which indicates that it is proficient at capturing intricate data patterns.
The key insight of this research is the need for early intervention and targeted training support in education. The EffiXNet framework offers a robust, scalable, and efficient solution for predicting student performance, with potential applications in academic institutions worldwide.
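The abstract names the stages of the hybrid feature selection but not their details, so the following is only a minimal sketch of how three of them (correlation filtering, mutual information ranking, and stability selection over bootstrap resamples) might fit together. It is not the authors' implementation. The RFE with CV stage is omitted because it requires a fitted model. The function names, the thresholds (`threshold=0.9`, `min_freq=0.8`, `k=2`), and the toy feature names are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(vx * vy)

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def correlation_filter(columns, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def stable_features(columns, target, k=2, rounds=50, min_freq=0.8, seed=0):
    """Return features ranked in the top k by MI in at least min_freq of bootstrap rounds."""
    rng = random.Random(seed)
    n = len(target)
    names = correlation_filter(columns)
    hits = Counter()
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample of row indices
        y = [target[i] for i in idx]
        scores = {f: mutual_information([columns[f][i] for i in idx], y) for f in names}
        for f in sorted(names, key=scores.get, reverse=True)[:k]:
            hits[f] += 1
    return [f for f in names if hits[f] / rounds >= min_freq]

# Hypothetical toy data: two informative features, one exact duplicate, one noise column.
rng = random.Random(42)
n = 400
attendance = [rng.randint(0, 4) for _ in range(n)]
homework = [rng.randint(0, 4) for _ in range(n)]
noise = [rng.randint(0, 4) for _ in range(n)]
passed = [1 if a + h >= 4 else 0 for a, h in zip(attendance, homework)]
columns = {
    "attendance": attendance,
    "attendance_copy": list(attendance),  # redundant with "attendance"
    "homework": homework,
    "noise": noise,
}
selected = stable_features(columns, passed)
print(selected)
```

In this toy setup the duplicate column is removed by the correlation filter, and the noise column rarely ranks in the top two by mutual information, so only the two informative features survive the stability threshold. On continuous features, the mutual information estimate would need binning or a nearest-neighbor estimator, which this sketch does not cover.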