Akolekar Harshal, Jhamnani Piyush, Kumar Vikash, Tailor Vinay, Pote Aditya, Meena Ankit, Kumar Kamal, Challa Jagat Sesh, Kumar Dhruv
Department of Mechanical Engineering, Indian Institute of Technology, Jodhpur, 342030, India.
School of AI & Data Science, Indian Institute of Technology, Jodhpur, 342030, India.
Sci Rep. 2025 Mar 17;15(1):9214. doi: 10.1038/s41598-025-93871-z.
This study evaluates the effectiveness of three leading generative AI tools (ChatGPT, Gemini, and Copilot) in undergraduate mechanical engineering education using a mixed-methods approach. The tools were assessed on 800 questions spanning seven core subjects and covering multiple-choice, numerical, and theory-based formats. All three tools performed strongly on theory-based questions but struggled with numerical problem-solving, particularly in areas requiring deep conceptual understanding and complex calculations. Copilot achieved the highest accuracy (60.38%), followed by Gemini (57.13%) and ChatGPT (46.63%). To complement these findings, a survey of 172 students and interviews with 20 participants provided insights into user experiences, challenges, and perceptions of AI in academic settings. Thematic analysis revealed concerns about AI's reliability in numerical tasks and its potential impact on students' problem-solving abilities. Based on these results, this study offers strategic recommendations for integrating AI into mechanical engineering curricula so that it is used responsibly to enhance learning without fostering dependency. Additionally, we propose instructional strategies to help educators adapt assessment methods in the era of AI-assisted learning. These findings contribute to the broader discussion of AI's role in engineering education and its implications for future learning methodologies.