Columbia University, New York, USA.
Ludwig Maximilian University of Munich, Munich, Germany.
Sci Rep. 2023 Apr 7;13(1):5705. doi: 10.1038/s41598-023-32484-w.
Student attrition poses a major challenge to academic institutions, funding bodies and students. With the rise of Big Data and predictive analytics, a growing body of work in higher education research has demonstrated the feasibility of predicting student dropout from readily available macro-level (e.g., socio-demographics or early performance metrics) and micro-level data (e.g., logins to learning management systems). Yet, the existing work has largely overlooked a critical meso-level element of student success known to drive retention: students' experience at university and their social embeddedness within their cohort. In partnership with a mobile application that facilitates communication between students and universities, we collected both (1) institutional macro-level data and (2) behavioral micro- and meso-level engagement data (e.g., the quantity and quality of interactions with university services and events as well as with other students) to predict dropout after the first semester. Analyzing the records of 50,095 students from four US universities and community colleges, we demonstrate that the combined macro- and meso-level data can predict dropout with high levels of predictive performance (average AUC across linear and non-linear models = 78%; max AUC = 88%). Behavioral engagement variables representing students' experience at university (e.g., network centrality, app engagement, event ratings) were found to add incremental predictive power beyond institutional variables (e.g., GPA or ethnicity). Finally, we highlight the generalizability of our results by showing that models trained on one university can predict retention at another university with reasonably high levels of predictive performance.
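The AUC figures reported in the abstract can be read as the probability that a model assigns a higher dropout risk to a randomly chosen student who dropped out than to a randomly chosen student who was retained. The following is a minimal sketch of that pairwise definition, not the study's own evaluation code; the function name `auc`, the 1/0 label encoding, and the toy scores are all assumptions made for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 = dropped out, 0 = retained (hypothetical encoding).
    scores: predicted dropout risk, higher = more likely to drop out.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one student in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # dropout correctly ranked above retained
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_auc = auc(labels, scores)
print(round(toy_auc, 3))  # 0.75
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The quadratic pairwise loop is chosen for clarity; a rank-based computation would be preferable for cohorts of the size analyzed here.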