Universidad Pontificia Comillas, Madrid, Community of Madrid, Spain.
Santalucía Chair of Analytics for Education, Madrid, Spain.
F1000Res. 2024 Oct 17;13:791. doi: 10.12688/f1000research.153129.2. eCollection 2024.
Large Language Models (LLMs), such as OpenAI's ChatGPT-4 Turbo, are revolutionizing several industries, including higher education. In this context, LLMs can be personalised through a customization process to meet student needs in a particular subject, such as statistics. OpenAI recently made it possible to customise its model through a natural language web interface, enabling the creation of customised GPT versions deliberately conditioned to meet the demands of a specific task.
This preliminary research aims to assess the potential of customised GPTs. After developing a Business Statistics Virtual Professor (BSVP), tailored for students at the Universidad Pontificia Comillas, its behaviour was evaluated and compared with that of ChatGPT-4 Turbo. First, each professor collected 15-30 genuine student questions from "Statistics and Probability" and "Business Statistics" courses across seven degrees, primarily from second-year courses. Second, these questions, often ambiguous and imprecise, were posed to both ChatGPT-4 Turbo and BSVP, and only their initial responses were recorded, with no follow-up prompts. Third, professors evaluated the responses blind on a 0-10 scale, considering quality, depth, and personalization. Finally, a statistical comparison of the two systems' performance was conducted.
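The abstract does not state which statistical test was used. As a minimal sketch of how paired blind ratings of the two systems could be compared, the following assumes each question receives one 0-10 score per system and applies an exact sign-flip permutation test to the per-question differences. The function name `paired_sign_flip_test` and the scores are hypothetical, invented only for illustration; they are not the study's data or its method.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Returns (observed mean difference, p-value).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be rated on the same questions")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)
    # Under the null hypothesis of no difference between systems, each
    # per-question difference is equally likely to have either sign,
    # so we enumerate all 2**n sign assignments (feasible for small n).
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        permuted = mean(s * d for s, d in zip(signs, diffs))
        # Small tolerance guards against floating-point ties.
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical ratings, invented purely for illustration (NOT the study's data).
bsvp_scores = [7, 8, 6, 9, 7, 8, 6, 7]
gpt4_scores = [7, 7, 7, 8, 8, 8, 6, 6]

mean_diff, p_value = paired_sign_flip_test(bsvp_scores, gpt4_scores)
print(f"mean difference = {mean_diff:.3f}, p = {p_value:.3f}")
```

A paired design fits the procedure described above because both systems answered the same questions. Exhaustive enumeration is only practical for a few dozen pairs at most; a larger sample would call for random sign flips or a standard paired test instead.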
The results lead to several conclusions. First, a substantial change in communication style was observed: following the instructions it was trained with, BSVP responded in a more relatable and friendly tone, even incorporating a few minor jokes. Second, when explicitly asked for course-specific material, for example "I would like to practice a programming exercise similar to those in R practice 4," BSVP provided a far superior response. Finally, regarding overall performance, quality, depth, and alignment with the specific content of the course, no statistically significant differences were observed between the responses of BSVP and ChatGPT-4 Turbo.
It appears that customised assistants trained with prompts present advantages as virtual aids for students, yet they do not constitute a substantial improvement over ChatGPT-4 Turbo.