Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control.

Author information

Wang Yan, Liang Lihua, Li Ran, Wang Yihua, Hao Changfu

Affiliations

Department of Child and Adolescent Health, School of Public Health, Zhengzhou University, Zhengzhou, Henan, People's Republic of China.

Primary and Secondary School Health Center, Zhengzhou Education Science Planning and Evaluation Center, Zhengzhou Municipal Education Bureau, Zhengzhou, Henan, People's Republic of China.

Publication information

J Multidiscip Healthc. 2024 Aug 13;17:3917-3929. doi: 10.2147/JMDH.S473680. eCollection 2024.

DOI:10.2147/JMDH.S473680

PMID:39155977

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11330241/

Abstract

PURPOSE

Chatbots based on large language models are increasingly being used in public health. However, the effectiveness of chatbot responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots (ChatGPT, Claude, and Bard) in responding to public health questions about myopia.

METHODS

Each of the three chatbots individually answered nineteen public health questions about myopia, covering three topics: policy, basics and measures. After the order of the responses was shuffled, each chatbot response was independently rated by 4 raters for comprehensiveness, accuracy and relevance.
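The rating procedure described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the fixed random seed, the data layout, and the choice to compute the composite as an unweighted mean of the three dimension means are all hypothetical; the abstract does not say how the composite score was calculated.

```python
import random
from statistics import mean

CHATBOTS = ["ChatGPT", "Claude", "Bard"]
DIMENSIONS = ["comprehensiveness", "accuracy", "relevance"]


def blind_responses(responses, seed=0):
    """Shuffle (chatbot, question_id, text) tuples and hide the chatbot name.

    Returns the blinded items shown to raters and a key that maps each
    blinded item ID back to its (chatbot, question_id) pair.
    """
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    shuffled = list(responses)
    rng.shuffle(shuffled)
    blinded, key = [], {}
    for item_id, (chatbot, question_id, text) in enumerate(shuffled):
        blinded.append({"item_id": item_id, "question_id": question_id, "text": text})
        key[item_id] = (chatbot, question_id)
    return blinded, key


def composite_scores(ratings, key):
    """Average the ratings per chatbot and per dimension.

    `ratings` is a list of dicts with "item_id", "rater", and one score per
    dimension. Assumption: composite = unweighted mean of the dimension means.
    """
    per_bot = {bot: {dim: [] for dim in DIMENSIONS} for bot in CHATBOTS}
    for rating in ratings:
        chatbot, _ = key[rating["item_id"]]
        for dim in DIMENSIONS:
            per_bot[chatbot][dim].append(rating[dim])
    summary = {}
    for bot, dims in per_bot.items():
        dim_means = {dim: mean(scores) for dim, scores in dims.items() if scores}
        if dim_means:
            summary[bot] = {**dim_means, "composite": mean(dim_means.values())}
    return summary
```

Raters would see only the blinded items; the key is applied after scoring to attribute each rating to its chatbot.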

RESULTS

The study's questions underwent reliability testing. The word counts of the responses differed significantly among the 3 chatbots; from most to least, the order was ChatGPT, Bard, and Claude. All 3 chatbots had a composite score above 4 out of 5. ChatGPT scored the highest in all aspects of the assessment. However, all chatbots exhibited shortcomings, such as giving fabricated responses.

CONCLUSION

Chatbots have shown great potential in public health, with ChatGPT performing best. Future use of chatbots as a public health tool will require the rapid development of standards for their use and monitoring, as well as continued research, evaluation and improvement of chatbots.
