


AI Chatbots in Answering Questions Related to Ocular Oncology: A Comparative Study Between DeepSeek v3, ChatGPT-4o, and Gemini 2.0.

Author Information

Das Deepsekhar, Narayan Atindra, Mishra Varsha, Takia Lalit, Grover Sumit, Bharati Avinav, Mb Shrijith

Affiliations

Ophthalmology, All India Institute of Medical Sciences, New Delhi, New Delhi, IND.

Medicine, All India Institute of Medical Sciences, New Delhi, New Delhi, IND.

Publication Information

Cureus. 2025 Aug 22;17(8):e90773. doi: 10.7759/cureus.90773. eCollection 2025 Aug.

DOI: 10.7759/cureus.90773
PMID: 40988843
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12450613/
Abstract

Background: Artificial intelligence (AI) chatbots are increasingly used in healthcare for information dissemination and clinical decision support. However, their reliability and applicability in subspecialties such as ocular oncology remain largely unassessed. This study aimed to evaluate the accuracy, completeness, readability, and real-world utility of three prominent AI chatbots, ChatGPT-4o (OpenAI, San Francisco, California, USA), DeepSeek v3 (DeepSeek, Hangzhou, Zhejiang, China), and Gemini 2.0 (Google DeepMind, London, UK), in responding to clinically relevant questions related to ocular malignancies.

Methods: A cross-sectional observational study was conducted at a tertiary eye care institute in Northern India. Five clinical questions, covering key ocular oncologic conditions, were created and standardized by ocular oncology experts. These prompts were input into ChatGPT-4o, DeepSeek v3, and Gemini 2.0. Responses were independently evaluated using a structured proforma assessing correctness, completeness, readability (Flesch-Kincaid score, word count, sentence count), presence of irrelevant data, applicability in the Indian healthcare setting, and reliability. Data were analyzed using Kruskal-Wallis and ANOVA statistical tests.

Results: All three chatbots demonstrated comparable correctness scores (mean 3.4, SD 0.49). However, four out of five responses from each chatbot were deemed incomplete. DeepSeek v3 provided the most verbose and readable answers (mean 533.8 words; Flesch score 38.0), while ChatGPT-4o generated the shortest but most clinically reliable responses (mean reliability 3.2). Gemini 2.0 exhibited the greatest variability in length and structure. No irrelevant content was observed in any chatbot responses. Only 2/5 responses from ChatGPT-4o and 1/5 from each of the other two were directly applicable to Indian clinical practice.

Conclusion: While AI chatbots can offer factually accurate responses to ocular oncology-related queries, they often fall short in completeness and clinical applicability. ChatGPT-4o showed the most balanced performance, though regional customization and expert oversight remain essential. Current models are not yet suitable for unsupervised use in high-stakes clinical scenarios.
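The readability metric cited in the Methods is the Flesch reading ease score, computed as 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words); higher values mean easier text, and the 38.0 reported for DeepSeek v3 falls in the "difficult" band. A minimal Python sketch of the formula follows; the syllable counter is a crude vowel-group heuristic for illustration, not the tool the authors actually used:

```python
import re

def flesch_reading_ease(text):
    """Flesch reading ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word):
        # Crude heuristic: count vowel groups, dropping a silent trailing 'e'.
        word = word.lower()
        n = len(re.findall(r"[aeiouy]+", word))
        if word.endswith("e") and n > 1:
            n -= 1
        return max(n, 1)

    total_syllables = sum(syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (total_syllables / len(words)))
```

Because long sentences and polysyllabic words both lower the score, dense clinical prose typically lands well below the ~60 threshold of plain English, consistent with the chatbot scores observed here.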
