A survival analysis based volatility and sparsity modeling network for student dropout prediction.

Affiliations

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China.

School of Information Science and Technology, Baotou Teachers' College, Baotou, Inner Mongolia, China.

Publication information

PLoS One. 2022 May 5;17(5):e0267138. doi: 10.1371/journal.pone.0267138. eCollection 2022.

Abstract

Student Dropout Prediction (SDP) is pivotal in mitigating withdrawals in Massive Open Online Courses (MOOCs). Previous studies generally modeled SDP as a binary classification task that yields a single prediction outcome. To obtain continuous and consistent predictions over time, some attempts have instead introduced survival analysis methods. However, the volatility and sparsity of the data often weaken the models' performance. Prevailing solutions rely heavily on data pre-processing that is independent of the predictive model, which is labor-intensive and may contaminate the authentic data. This paper proposes a Survival Analysis based Volatility and Sparsity Modeling Network (SAVSNet) to address these issues in an end-to-end deep learning framework. Specifically, SAVSNet smooths the volatile time series with a convolutional network while preserving the original data information with a Long Short-Term Memory (LSTM) network. Furthermore, we propose a Time-Missing-Aware LSTM unit that mitigates the impact of data sparsity by integrating informative missingness patterns into the model. A survival analysis loss function is adopted for parameter estimation, and the model outputs monotonically decreasing survival probabilities. In the experiments, we compare the proposed method with state-of-the-art methods on two real-world MOOC datasets, and the results show the effectiveness of the proposed model.
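
The abstract does not spell out the loss or the output layer, so the following is only a minimal sketch of one common way to obtain monotonically decreasing survival probabilities: a generic discrete-time survival formulation, not necessarily the one used in SAVSNet. It assumes the network emits a per-step dropout hazard h_t in (0, 1); the function names `survival_curve` and `survival_nll`, and the toy hazard values, are hypothetical.

```python
import math

def survival_curve(hazards):
    """Convert per-step hazards h_t into S(t) = prod_{k<=t} (1 - h_k).

    Each factor (1 - h_k) lies in (0, 1), so the curve can only decrease.
    """
    curve, alive = [], 1.0
    for h in hazards:
        alive *= 1.0 - h
        curve.append(alive)
    return curve

def survival_nll(hazards, event_step, dropped_out, eps=1e-12):
    """Negative log-likelihood for one student (assumed discrete-time form).

    event_step: 0-based index of the dropout step if dropped_out is True,
                otherwise the last step at which the student was observed.
    """
    loss = 0.0
    # The student is known to have survived every step before event_step.
    for h in hazards[:event_step]:
        loss -= math.log(max(1.0 - h, eps))
    h_last = hazards[event_step]
    if dropped_out:
        # Dropout observed at event_step: the hazard itself is the likelihood term.
        loss -= math.log(max(h_last, eps))
    else:
        # Censored: the student also survived event_step.
        loss -= math.log(max(1.0 - h_last, eps))
    return loss

# Toy hazards for three time steps (illustrative values, not from the paper).
hazards = [0.1, 0.2, 0.5]
curve = survival_curve(hazards)
loss_dropout = survival_nll(hazards, event_step=2, dropped_out=True)
loss_censored = survival_nll(hazards, event_step=1, dropped_out=False)
```

Because the curve is built as a running product of factors below one, monotonicity holds by construction rather than being learned, which is one way a model can guarantee consistent predictions over time.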

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75e9/9071151/fc09fc2da2e1/pone.0267138.g001.jpg