Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control.

Authors

Wang Yan, Liang Lihua, Li Ran, Wang Yihua, Hao Changfu

Affiliations

Department of Child and Adolescent Health, School of Public Health, Zhengzhou University, Zhengzhou, Henan, People's Republic of China.

Primary and Secondary School Health Center, Zhengzhou Education Science Planning and Evaluation Center, Zhengzhou Municipal Education Bureau, Zhengzhou, Henan, People's Republic of China.

Publication information

J Multidiscip Healthc. 2024 Aug 13;17:3917-3929. doi: 10.2147/JMDH.S473680. eCollection 2024.

DOI:10.2147/JMDH.S473680
PMID:39155977
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11330241/
Abstract

PURPOSE

Chatbots based on large language models are increasingly used in public health. However, the effectiveness of chatbot responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots (ChatGPT, Claude, and Bard) in responding to public health questions about myopia.

METHODS

Each of the three chatbots individually answered nineteen public health questions about myopia, covering three topics: policy, basics, and measures. After the order of the responses was shuffled, four raters independently rated each chatbot response for comprehensiveness, accuracy, and relevance.

RESULTS

The study's questions underwent reliability testing. The word counts of the responses differed significantly among the three chatbots; from most to least, the order was ChatGPT, Bard, and Claude. All three chatbots had a composite score above 4 out of 5. ChatGPT scored the highest in all aspects of the assessment. However, all chatbots exhibited shortcomings, such as giving fabricated responses.

CONCLUSION

Chatbots have shown great potential in public health, with ChatGPT being the best. The future use of chatbots as a public health tool will require rapid development of standards for their use and monitoring, as well as continued research, evaluation and improvement of chatbots.
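The rating scheme described in the Methods can be illustrated with a minimal sketch. The abstract does not say how the composite score was calculated, so the code assumes it is the plain mean of all 1-to-5 ratings across raters and the three dimensions. The function names (`blind_order`, `composite_scores`), the chatbot labels, and the ratings are hypothetical placeholders, not data from the study.

```python
import random
from statistics import mean

# The three rating dimensions named in the abstract.
DIMENSIONS = ("comprehensiveness", "accuracy", "relevance")


def blind_order(responses, seed=0):
    """Return a shuffled copy of the responses so position does not reveal the chatbot."""
    shuffled = list(responses)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def composite_scores(ratings):
    """Average every dimension score per chatbot (assumed definition of 'composite')."""
    by_bot = {}
    for rating in ratings:
        by_bot.setdefault(rating["chatbot"], []).extend(rating[d] for d in DIMENSIONS)
    return {bot: mean(scores) for bot, scores in by_bot.items()}


# Invented placeholder ratings on a 1-5 scale; NOT the study's data.
ratings = [
    {"chatbot": "A", "rater": 1, "comprehensiveness": 5, "accuracy": 4, "relevance": 5},
    {"chatbot": "A", "rater": 2, "comprehensiveness": 4, "accuracy": 4, "relevance": 5},
    {"chatbot": "B", "rater": 1, "comprehensiveness": 3, "accuracy": 4, "relevance": 4},
    {"chatbot": "B", "rater": 2, "comprehensiveness": 4, "accuracy": 3, "relevance": 4},
]

if __name__ == "__main__":
    for bot, score in composite_scores(ratings).items():
        print(f"{bot}: {score:.2f}")
```

A weighted composite or per-dimension reporting would be equally plausible readings of the abstract; the unweighted mean is used here only because it is the simplest choice.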

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c1b/11330241/98dbabec3a2b/JMDH-17-3917-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c1b/11330241/5b7e06c38877/JMDH-17-3917-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c1b/11330241/f96064f648c0/JMDH-17-3917-g0002.jpg
