
Knowledge Discovery for Higher Education Student Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile.

Authors

Palacios Carlos A, Reyes-Suárez José A, Bearzotti Lorena A, Leiva Víctor, Marchant Carolina

Affiliations

Departamento de Obras Civiles, Universidad Católica del Maule, Talca 3480112, Chile.

Programa de Magíster en Gestión de Operaciones, Facultad de Ingeniería, Universidad de Talca, Curicó 3344158, Chile.

Publication information

Entropy (Basel). 2021 Apr 20;23(4):485. doi: 10.3390/e23040485.

DOI:10.3390/e23040485
PMID:33923879
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8072774/
Abstract

Data mining is employed to extract useful information and to detect patterns in often large data sets, and is closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to help the process of knowledge discovery. We predict student retention at each of three levels during their first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level at which dropout occurs. The machine learning algorithms used in this work are decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with similar conditions to the Chilean case, where dropout rates affect the efficiency of such institutions. The ability to predict dropout from student data enables these institutions to take preventive measures and avoid dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.
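The abstract's closing remark about balancing the majority and minority classes can be illustrated with a minimal sketch. The abstract does not say which balancing method the authors used, so random oversampling is swapped in here as one simple option. The function name `oversample_minority`, the toy feature tuples (a secondary-school score and a poverty index), and the class labels are all hypothetical and are not taken from the paper's data.

```python
import random
from collections import Counter


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of the smaller classes until every class
    has as many rows as the largest one (random oversampling).

    This is an illustrative stand-in: the source abstract does not name
    the balancing technique that was actually applied.
    """
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)

    # Shuffle so that classes are not stored in contiguous runs.
    order = list(range(len(out_rows)))
    rng.shuffle(order)
    return [out_rows[i] for i in order], [out_labels[i] for i in order]


# Hypothetical toy data: (secondary-school score, community poverty index).
rows = [(600 + i, 0.10) for i in range(8)] + [(450, 0.35), (470, 0.40)]
labels = ["retained"] * 8 + ["dropout"] * 2

balanced_rows, balanced_labels = oversample_minority(rows, labels)
print("before:", Counter(labels))
print("after: ", Counter(balanced_labels))
```

In practice, resampling of this kind is applied to the training split only, so that duplicated rows do not leak into the data used to measure accuracy.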


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd0e/8072774/1a81930d4af2/entropy-23-00485-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd0e/8072774/a9d644a1274c/entropy-23-00485-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd0e/8072774/252209e8500d/entropy-23-00485-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd0e/8072774/41b03b3ccf27/entropy-23-00485-g003.jpg
