McIntyre Nora A
Southampton Education School, University of Southampton, Building 32, University Rd, Highfield, Southampton, SO17 1BJ UK.
Educ Inf Technol (Dordr). 2023;28(4):3787-3832. doi: 10.1007/s10639-022-11280-5. Epub 2022 Oct 4.
Access to education is the first step to benefiting from it. Although cumulative online learning experience is linked to academic learning gains, between-country inequalities prevent large populations from accumulating such experience. Low- and middle-income countries are disadvantaged by infrastructure such as internet access, by uncontextualised learning content, and by parents who are less available and less well-resourced than those in high-income countries. COVID-19 has exacerbated these global inequalities, with girls affected more than boys in these regions. The present research therefore mined online learning data to identify features that are important for access to online learning. An initial 54,842,787 data points (random subsample n = 5000) from one online learning platform were mined, with theory and data partnered in model development. A theory-led machine learning model was examined first, and a data-led approach was then taken to reach a final model. The final model was used to derive Shapley values for feature importance. As expected, country differences, gender, and COVID-19 were important features in access to online learning. The data-led model development yielded additional insights not examined in the initial, theory-led model: namely, the importance of math ability, year of birth, session difficulty level, month of birth, and time taken to complete a session.
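To make the notion of Shapley-value feature importance concrete, the following is a minimal sketch and not the study's pipeline, which the abstract does not describe in detail. It computes exact Shapley values by enumerating every coalition of features. The feature names, the `toy_value` scoring function, and all numbers in it are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a coalition value function.

    `value` maps a frozenset of feature names to a number, e.g. the
    score a model achieves when only those features are available.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        # Average the marginal contribution of f over all coalitions
        # of the remaining features, weighted by coalition size.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical scores, not taken from the paper."""
    score = 0.0
    if "country" in coalition:
        score += 0.30
    if "gender" in coalition:
        score += 0.10
    if "covid_period" in coalition:
        score += 0.05
    # Hypothetical interaction between country and gender.
    if "country" in coalition and "gender" in coalition:
        score += 0.04
    return score


phi = shapley_values(["country", "gender", "covid_period"], toy_value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy setup the interaction term is split equally between the two features involved, and the contributions sum to the difference between the score with all features and the score with none. Exact enumeration grows exponentially with the number of features, so analyses of real models generally rely on approximations.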