Evaluating a Large Language Model's Ability to Synthesize a Health Science Master's Thesis: Case Study.

Authors

Joranger Pål, Rivenes Lafontan Sara, Brevik Asgeir

Affiliation

Department of Nursing and Health Promotion, Faculty of Health Sciences, OsloMet - Oslo Metropolitan University, P.O. Box 4 St. Olavs plass, Oslo, N-0130, Norway, 47 67236520.

Publication information

JMIR Form Res. 2025 Jul 3;9:e73248. doi: 10.2196/73248.

Abstract

BACKGROUND

Large language models (LLMs) can help students master a new topic quickly, but educational institutions responsible for assessing and grading students' academic level can find it difficult to discern whether a text originated from a student's own cognition or was synthesized by an LLM. Universities have traditionally relied on a submitted written thesis as proof of higher-level learning, on which to grant grades and diplomas. But what happens when LLMs are able to mimic the academic writing of subject matter experts? This is now a real dilemma. The ubiquitous availability of LLMs challenges trust in the master's thesis as evidence of subject matter comprehension and academic competencies.

OBJECTIVE

In this study, we aimed to assess the quality of rapidly machine-generated papers against the standards of the health science master's program with which we are currently affiliated.

METHODS

In an exploratory case study, we used ChatGPT (OpenAI) to generate 2 research papers that a student could conceivably submit as a master's thesis to graduate from a health science master's program. One paper simulated a qualitative health science research project and the other simulated a quantitative health science research project.

RESULTS

Using a stepwise approach, we prompted ChatGPT to (1) synthesize 2 credible datasets and (2) generate 2 papers that, in our judgment, would have been able to pass as credible medium-quality graduation research papers at the health science master's program the authors are currently affiliated with. It took 2.5 hours of iterative dialogue with ChatGPT to develop the qualitative paper and 3.5 hours to develop the quantitative paper. Making the synthetic datasets that served as a starting point for our ChatGPT-driven paper development took 1.5 and 16 hours for the qualitative and quantitative datasets, respectively. This included learning and prompt optimization, and for the quantitative dataset, it included the time it took to create tables, estimate relevant bivariate correlation coefficients, and prepare these coefficients to be read by ChatGPT.

CONCLUSIONS

Our demonstration highlights the ease with which an LLM can synthesize research data, conduct scientific analyses, and produce credible research papers required for graduation from a master's program. A clear and well-written master's thesis, citing subject matter authorities and true to the expectations for academic writing, can no longer be regarded as solid proof of either extensive study or subject matter mastery. To uphold the integrity of academic standards and the value of university diplomas, we recommend that master's programs prioritize oral examinations and school exams. This shift is now crucial to ensure a fair and rigorous assessment of higher-order learning and abilities at the master's level.

