Artificial Versus Human Intelligence in the Diagnostic Approach of Ophthalmic Case Scenarios: A Qualitative Evaluation of Performance and Consistency.

Authors

Mandalos Achilleas, Tsouris Dimitrios

Affiliations

Ophthalmology, General Hospital of Karditsa, Karditsa, GRC.

Ophthalmology, General University Hospital of Larissa, Larissa, GRC.

Publication information

Cureus. 2024 Jun 16;16(6):e62471. doi: 10.7759/cureus.62471. eCollection 2024 Jun.

DOI: 10.7759/cureus.62471
PMID: 39015855
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11251728/
Abstract

PURPOSE

To evaluate the efficiency of three artificial intelligence (AI) chatbots (ChatGPT-3.5 (OpenAI, San Francisco, California, United States), Bing Copilot (Microsoft Corporation, Redmond, Washington, United States), Google Gemini (Google LLC, Mountain View, California, United States)) in assisting the ophthalmologist in the diagnostic approach and management of challenging ophthalmic cases and compare their performance with that of a practicing human ophthalmic specialist. The secondary aim was to assess the short- and medium-term consistency of ChatGPT's responses.

METHODS

Eleven ophthalmic case scenarios of variable complexity were presented to the AI chatbots and to an ophthalmic specialist in a stepwise fashion. Advice regarding the initial differential diagnosis, the final diagnosis, further investigation, and management was asked for. One month later, the same process was repeated twice on the same day for ChatGPT only.

RESULTS

The individual diagnostic performance of all three AI chatbots was inferior to that of the ophthalmic specialist; however, they provided useful complementary input in the diagnostic algorithm. This was especially true for ChatGPT and Bing Copilot. ChatGPT exhibited reasonable short- and medium-term consistency, with the mean Jaccard similarity coefficient of responses varying between 0.58 and 0.76.
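
The Jaccard similarity coefficient mentioned above is the size of the intersection of two sets divided by the size of their union, so 1.0 means identical sets and 0.0 means no overlap. The abstract does not say how responses were converted into sets, so the sketch below assumes each response is reduced to a set of suggested diagnoses; the function names and the example diagnoses are hypothetical and are not taken from the study.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|.

    Two empty sets are treated as identical (1.0); this is a convention
    chosen for the sketch, not something stated in the abstract.
    """
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_jaccard(pairs: list[tuple[set[str], set[str]]]) -> float:
    """Average the coefficient over paired responses (one pair per case)."""
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical differential diagnoses from two sessions on the same case.
first = {"uveitis", "keratitis", "acute glaucoma"}
second = {"uveitis", "keratitis", "scleritis"}

print(jaccard(first, second))  # 2 shared items out of 4 distinct items -> 0.5
```

Under this reading, a mean coefficient between 0.58 and 0.76 would indicate that repeated sessions shared more than half of their suggested items on average, but not all of them.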

CONCLUSION

AI chatbots may act as useful assisting tools in the diagnosis and management of challenging ophthalmic cases; however, their responses should be scrutinized for potential inaccuracies, and by no means can they replace consultation with an ophthalmic specialist.
