

STEM exam performance: Open- versus closed-book methods in the large language model era.

Authors

Mizori Rasi, Sadiq Muhayman, Ahmad Malik Takreem, Siu Anthony, Ahmad Reubeen Rashid, Yang Zijing, Oram Helen, Galloway James

Affiliations

GKT School of Medicine, Faculty of Life Sciences & Medicine, King's College London, London, UK.

Brighton and Sussex Medical School, Sussex, UK.

Publication information

Clin Teach. 2025 Feb;22(1):e13839. doi: 10.1111/tct.13839. Epub 2024 Nov 4.

Abstract

BACKGROUND

The COVID-19 pandemic accelerated the shift to remote learning, heightening scrutiny of open-book examinations (OBEs) versus closed-book examinations (CBEs) within science, technology, engineering and mathematics (STEM) education. This study evaluates the efficacy of OBEs compared to CBEs on student performance and perceptions within STEM subjects, considering the emerging influence of sophisticated large language models (LLMs) such as GPT-3.

METHODS

Adhering to PRISMA guidelines, this systematic review analysed peer-reviewed articles published from 2013, focusing on the impact of OBEs and CBEs on university STEM students. Standardised mean differences were assessed using a random effects model, with heterogeneity evaluated by the I² statistic, Cochran's Q test and the Tau² statistic.
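The random-effects pooling and heterogeneity statistics named above can be sketched in a few lines. The sketch below uses the standard DerSimonian-Laird method-of-moments estimator; the study effect sizes and variances are illustrative placeholders, not the review's actual data.

```python
import math

# Hypothetical per-study standardised mean differences (OBE minus CBE)
# and their variances; illustrative numbers only, not the review's data.
smds = [0.30, 0.55, -0.10, 0.42, 0.20, 0.65, 0.05, 0.38]
variances = [0.02, 0.03, 0.025, 0.04, 0.015, 0.05, 0.02, 0.03]

def dersimonian_laird(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    df = k - 1
    # Tau^2: between-study variance (method-of-moments estimator)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # I^2: percentage of total variability attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights fold tau^2 into each study's variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, q, tau2, i2

pooled, se, q, tau2, i2 = dersimonian_laird(smds, variances)
print(f"pooled SMD = {pooled:.3f} (95% CI ±{1.96 * se:.3f}), "
      f"Q = {q:.2f}, Tau^2 = {tau2:.4f}, I^2 = {i2:.1f}%")
```

An I² near 97%, as reported below, would mean almost all observed variability reflects genuine between-study differences rather than sampling error, which is why a random-effects model is the appropriate choice here.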

RESULTS

Analysis of eight studies revealed mixed outcomes. Meta-analysis showed that OBEs generally resulted in better scores than CBEs, despite significant heterogeneity (I² = 97%). Observational studies displayed more pronounced effects, with noted concerns over technical difficulties and instances of cheating.

DISCUSSION

Results suggest that OBEs assess competencies more aligned with current educational paradigms than CBEs. However, the emergence of LLMs poses new challenges to OBE validity by simplifying the generation of comprehensive answers, impacting academic integrity and examination fairness.

CONCLUSIONS

While OBEs are better suited to contemporary educational needs, the influence of LLMs on their effectiveness necessitates further study. Institutions should prudently consider the competencies assessed by OBEs, particularly in light of evolving technological landscapes. Future research should explore the integrity of OBEs in the presence of LLMs to ensure fair and effective student evaluations.


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/933d/11663729/d30a61983cb9/TCT-22-e13839-g001.jpg
