Bahir Daniel, Zur Omri, Attal Leah, Nujeidat Zaki, Knaanie Ariela, Pikkel Joseph, Mimouni Michael, Plopsky Gilad
Department of Ophthalmology, Tzafon Medical Center, Poriya, Israel.
Azrieli Faculty of Medicine, Bar Ilan University, Safed, Israel.
Graefes Arch Clin Exp Ophthalmol. 2025 Feb;263(2):527-536. doi: 10.1007/s00417-024-06625-4. Epub 2024 Sep 15.
The rapid advancement of artificial intelligence (AI), particularly in large language models like ChatGPT and Google's Gemini AI, marks a transformative era in technological innovation. This study explores the potential of AI in ophthalmology, focusing on the capabilities of ChatGPT and Gemini AI. While these models hold promise for medical education and clinical support, their integration requires comprehensive evaluation. This research aims to bridge a gap in the literature by comparing Gemini AI and ChatGPT, assessing their performance against ophthalmology residents using a dataset derived from ophthalmology board exams.
A dataset comprising 600 questions across 12 subspecialties was curated from Israeli ophthalmology residency exams, encompassing text and image-based formats. Four AI models - ChatGPT-3.5, ChatGPT-4, Gemini, and Gemini Advanced - underwent testing on this dataset. The study includes a comparative analysis with Israeli ophthalmology residents, employing specific metrics for performance assessment.
Gemini Advanced performed best, with an accuracy of 66%, followed by ChatGPT-4 at 62% and Gemini at 58%; ChatGPT-3.5, which served as the reference, scored 46%. Comparative analysis with residents offered insights into the AI models' performance relative to human-level medical knowledge. Further analysis examined yearly performance trends, topic-specific variations, and the impact of images on chatbot accuracy.
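As an illustration only, per-model accuracy of the kind reported here could be tallied as in the following minimal sketch. The function name `accuracy_by_model`, the model labels, and the toy questions and answers are all hypothetical and are not taken from the study; the abstract does not describe the authors' actual scoring procedure.

```python
from collections import defaultdict


def accuracy_by_model(responses, answer_key):
    """Return {model: fraction of its answered questions that match the key}.

    responses  -- iterable of (model, question_id, chosen_answer) tuples
    answer_key -- dict mapping question_id to the correct answer
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, question_id, answer in responses:
        total[model] += 1
        if answer == answer_key[question_id]:
            correct[model] += 1
    return {model: correct[model] / total[model] for model in total}


# Hypothetical toy data: four multiple-choice questions, two made-up models.
answer_key = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}
responses = [
    ("model_x", "q1", "A"), ("model_x", "q2", "C"),
    ("model_x", "q3", "D"), ("model_x", "q4", "D"),
    ("model_y", "q1", "B"), ("model_y", "q2", "C"),
    ("model_y", "q3", "B"), ("model_y", "q4", "A"),
]

scores = accuracy_by_model(responses, answer_key)
for model, score in sorted(scores.items()):
    print(f"{model}: {score:.0%}")
```

The same tally could be restricted to one subspecialty or to image-based questions by filtering `responses` before calling the function.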
The study reveals nuanced AI model capabilities in ophthalmology, emphasizing domain-specific variations. The superior performance of Gemini Advanced indicates significant advancements, while ChatGPT-4's improvement is noteworthy. Both Gemini and ChatGPT-3.5 demonstrated commendable performance. The comparative analysis underscores AI's evolving role as a supplementary tool in medical education. This research contributes vital insights into AI effectiveness in ophthalmology, highlighting areas for refinement. As AI models evolve, targeted improvements can enhance adaptability across subspecialties, making them valuable tools for medical professionals and enriching patient care.
What is known: AI breakthroughs, like ChatGPT and Google's Gemini AI, are reshaping healthcare. In ophthalmology, AI integration has overhauled clinical workflows, particularly in analyzing images for diseases like diabetic retinopathy and glaucoma.
What is new: This study presents a pioneering comparison between Gemini AI and ChatGPT, evaluating their performance against ophthalmology residents using a meticulously curated dataset derived from real-world ophthalmology board exams. Notably, Gemini Advanced demonstrates superior performance, showcasing substantial advancements, while the evolution of ChatGPT-4 also merits attention. Both models exhibit commendable capabilities. These findings offer crucial insights into the efficacy of AI in ophthalmology, shedding light on areas ripe for further enhancement and optimization.