Department of Psychiatry, University of Saskatchewan, Saskatoon, Canada.
PLoS One. 2018 Nov 26;13(11):e0207919. doi: 10.1371/journal.pone.0207919. eCollection 2018.
With the high cost of the research assessment exercises in the UK, many have called for simpler and less time-consuming alternatives. In this work, we gathered publicly available REF data, combined them with library-subscribed data, and used machine learning to examine whether the overall result of the Research Excellence Framework 2014 could be replicated. A Bayesian additive regression tree model predicting university grade point average (GPA) from an initial set of 18 candidate explanatory variables was developed. One hundred and nine universities were randomly divided into a training set (n = 79) and test set (n = 30). The model "learned" associations between GPA and the other variables in the training set and was made to predict the GPA of universities in the test set. GPA could be predicted from just three variables: the number of Web of Science documents, entry tariff, and percentage of students coming from state schools (r-squared = .88). Implications of this finding are discussed and proposals are given.
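The evaluation design described in the abstract (a random 79/30 split of 109 universities, a model fitted on the training set, and r-squared computed on held-out predictions) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the data are synthetic placeholders rather than REF data, the three predictor columns and their coefficients are invented, and ordinary least squares is swapped in for the Bayesian additive regression tree model used in the study. The helper names `split_train_test` and `r_squared` are hypothetical.

```python
import numpy as np


def split_train_test(n_total, n_train, rng):
    """Randomly partition the indices 0..n_total-1 into train and test arrays."""
    perm = rng.permutation(n_total)
    return perm[:n_train], perm[n_train:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)

# ASSUMPTION: synthetic placeholder data, not the REF 2014 data.
# 109 "universities", 3 standardized predictors, and a made-up linear signal.
n_universities = 109
X = rng.normal(size=(n_universities, 3))
true_coef = np.array([0.3, 0.2, -0.1])  # invented for illustration only
y = 3.0 + X @ true_coef + rng.normal(scale=0.1, size=n_universities)

# Random split into a training set (n = 79) and a test set (n = 30).
train_idx, test_idx = split_train_test(n_universities, 79, rng)

# ASSUMPTION: ordinary least squares with an intercept stands in for BART.
A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)

# Predict the held-out universities and score the predictions.
A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
y_pred = A_test @ coef
test_r2 = r_squared(y[test_idx], y_pred)

print(f"train n = {len(train_idx)}, test n = {len(test_idx)}")
print(f"held-out r-squared = {test_r2:.2f}")
```

Scoring on the 30 held-out universities, rather than on the 79 used for fitting, is what lets the r-squared value speak to how well the model generalizes. The value printed here reflects only the synthetic data and has no bearing on the reported .88.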