Martin Alex F, Tubaltseva Svitlana, Harrison Anja, Rubin G James
Department of Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London WC2R 2LS, UK.
School of Liberal Arts, Richmond American University London, London W4 5AN, UK.
Behav Sci (Basel). 2025 Jun 12;15(6):808. doi: 10.3390/bs15060808.
Generative AI tools offer opportunities for enhancing learning and assessment, but raise concerns about equity, academic integrity, and the ability to critically engage with AI-generated content. This study explores these issues within a psychology-oriented postgraduate programme at a UK university. We co-designed and evaluated a novel AI-integrated assessment aimed at improving critical AI literacy among students and teaching staff (pre-registration: osf.io/jqpce). Students were randomly allocated to two groups: the 'compliant' group used AI tools to assist with writing a blog and critically reflected on the outputs, while the 'unrestricted' group had free rein to use AI to produce the assessment. Teaching staff, blinded to group allocation, marked the blogs using an adapted rubric. Focus groups, interviews, and workshops were conducted to assess the feasibility, acceptability, and perceived integrity of the approach. Findings suggest that, when carefully scaffolded, integrating AI into assessments can promote both technical fluency and ethical reflection. A key contribution of this study is its participatory co-design and evaluation method, which was effective and transferable, and is presented as a practical toolkit for educators. This approach supports growing calls for authentic assessment that mirrors real-world tasks, while highlighting the ongoing need to balance academic integrity with skill development.