Evaluating ensemble models for fair and interpretable prediction in higher education using multimodal data.

Authors

Arévalo-Cordovilla Felipe Emiliano, Peña Marta

Affiliations

Faculty of Science and Engineering, Universidad Estatal de Milagro, Ciudadela Universitaria "Dr. Rómulo Minchala Murillo", km. 1.5 vía Milagro - Virgen de Fátima, Milagro, 091050, Ecuador.

Department of Mathematics and IOC Research Institute, Universitat Politècnica de Catalunya-BarcelonaTech, Diagonal 647, Barcelona, 08028, Spain.

Publication information

Sci Rep. 2025 Aug 11;15(1):29420. doi: 10.1038/s41598-025-15388-9.

DOI: 10.1038/s41598-025-15388-9
PMID: 40789907
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12339690/
Abstract

Early prediction of academic performance is vital for reducing attrition in online higher education. However, existing models often lack comprehensive data integration and comparison with state-of-the-art techniques. This study, which involved 2,225 engineering students at a public university in Ecuador, addressed these gaps. The objective was to develop a robust predictive framework by integrating Moodle interactions, academic history, and demographic data, using SMOTE for class balancing. The methodology involved a comparative evaluation of seven base learners, including traditional algorithms, Random Forest, and gradient boosting ensembles (XGBoost, LightGBM), and a final stacking model, all validated using 5-fold stratified cross-validation. While the LightGBM model emerged as the best-performing base model (Area Under the Curve (AUC) = 0.953, F1 = 0.950), the stacking ensemble (AUC = 0.835) did not offer a significant performance improvement and showed considerable instability. SHAP analysis confirmed that early grades were the most influential predictors across top models. The final model demonstrated strong fairness across gender, ethnicity, and socioeconomic status (consistency = 0.907). These findings enable institutions to identify at-risk students using state-of-the-art, interpretable, and fair models, advancing learning analytics by validating key success predictors against contemporary benchmarks.
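The abstract names two of the evaluation ingredients, SMOTE class balancing and 5-fold stratified cross-validation, without saying how they are combined. The following is a minimal, standard-library-only sketch of one common arrangement, in which resampling is applied to the training portion of each fold only. It is an illustration under stated assumptions, not the authors' pipeline: the toy data, the helper names (`stratified_folds`, `smote`, `make_toy_data`), and the choice to resample inside each fold are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's indices round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def smote(X, y, k_neighbors=5, seed=0):
    """Simplified SMOTE: grow the smallest class to the size of the largest
    by interpolating between a minority point and one of its nearest
    minority neighbours. Assumes at least two minority samples."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority, n_min = min(counts.items(), key=lambda kv: kv[1])
    n_needed = max(counts.values()) - n_min
    pool = [x for x, label in zip(X, y) if label == minority]
    X_new, y_new = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(pool)
        # Rank the other minority points by squared Euclidean distance to base.
        others = sorted(
            (p for p in pool if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        X_new.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_new.append(minority)
    return X_new, y_new


def make_toy_data(n=200, minority_share=0.2, seed=0):
    """Hypothetical two-feature data set with an imbalanced binary label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i < n * minority_share else 0
        centre = 2.0 if label == 1 else 0.0
        X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
folds = stratified_folds(y, k=5)
for fold_id, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    # Resample the training portion only; the held-out fold keeps its imbalance.
    X_bal, y_bal = smote(X_train, y_train, seed=fold_id)
    print(fold_id, Counter(y_train), Counter(y_bal),
          Counter(y[i] for i in test_idx))
```

Resampling inside the loop keeps synthetic points out of the held-out fold, so the reported metrics reflect the original class distribution. In practice one would more likely reach for library implementations of SMOTE and stratified splitting than for hand-written helpers like these.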


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a0b/12339690/6c6dc938c2c4/41598_2025_15388_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a0b/12339690/21e0a825bff3/41598_2025_15388_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a0b/12339690/56e2c3c48157/41598_2025_15388_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a0b/12339690/b09b2371d68c/41598_2025_15388_Fig3_HTML.jpg

Similar articles

1
Evaluating ensemble models for fair and interpretable prediction in higher education using multimodal data.
Sci Rep. 2025 Aug 11;15(1):29420. doi: 10.1038/s41598-025-15388-9.
2
Supervised Machine Learning Models for Predicting Sepsis-Associated Liver Injury in Patients With Sepsis: Development and Validation Study Based on a Multicenter Cohort Study.
J Med Internet Res. 2025 May 26;27:e66733. doi: 10.2196/66733.
3
A Responsible Framework for Assessing, Selecting, and Explaining Machine Learning Models in Cardiovascular Disease Outcomes Among People With Type 2 Diabetes: Methodology and Validation Study.
JMIR Med Inform. 2025 Jun 27;13:e66200. doi: 10.2196/66200.
4
Prediction of Insulin Resistance in Nondiabetic Population Using LightGBM and Cohort Validation of Its Clinical Value: Cross-Sectional and Retrospective Cohort Study.
JMIR Med Inform. 2025 Jun 13;13:e72238. doi: 10.2196/72238.
5
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
6
Interpretable prediction of hospital mortality in bleeding critically ill patients based on machine learning and SHAP.
BMC Med Inform Decis Mak. 2025 Jul 15;25(1):263. doi: 10.1186/s12911-025-03101-9.
7
Are Current Survival Prediction Tools Useful When Treating Subsequent Skeletal-related Events From Bone Metastases?
Clin Orthop Relat Res. 2024 Sep 1;482(9):1710-1721. doi: 10.1097/CORR.0000000000003030. Epub 2024 Mar 22.
8
Clinical prediction of intravenous immunoglobulin-resistant Kawasaki disease based on interpretable Transformer model.
PLoS One. 2025 Jul 9;20(7):e0327564. doi: 10.1371/journal.pone.0327564. eCollection 2025.
9
Development of an interpretable machine learning model for frailty risk prediction in older adult care institutions: a mixed-methods, cross-sectional study in China.
BMJ Open. 2025 Jul 5;15(7):e095460. doi: 10.1136/bmjopen-2024-095460.
10
Optimizing machine learning model selection for landslide susceptibility mapping: analysis of similar performance metrics and the critical role of multi-criteria evaluation.
Environ Sci Pollut Res Int. 2025 Jun;32(30):18434-18460. doi: 10.1007/s11356-025-36761-1. Epub 2025 Jul 24.

References cited in this article

1
A PSO weighted ensemble framework with SMOTE balancing for student dropout prediction in smart education systems.
Sci Rep. 2025 May 20;15(1):17463. doi: 10.1038/s41598-025-97506-1.
2
Academic achievement prediction in higher education through interpretable modeling.
PLoS One. 2024 Sep 5;19(9):e0309838. doi: 10.1371/journal.pone.0309838. eCollection 2024.