Comparison of the Performance of ChatGPT, Claude and Bard in Support of Myopia Prevention and Control.

Authors

Wang Yan, Liang Lihua, Li Ran, Wang Yihua, Hao Changfu

Affiliations

Department of Child and Adolescent Health, School of Public Health, Zhengzhou University, Zhengzhou, Henan, People's Republic of China.

Primary and Secondary School Health Center, Zhengzhou Education Science Planning and Evaluation Center, Zhengzhou Municipal Education Bureau, Zhengzhou, Henan, People's Republic of China.

Publication information

J Multidiscip Healthc. 2024 Aug 13;17:3917-3929. doi: 10.2147/JMDH.S473680. eCollection 2024.

Abstract

PURPOSE

Chatbots based on large language models are increasingly being used in public health. However, the effectiveness of their responses has been debated, and their performance in myopia prevention and control has not been fully explored. This study aimed to evaluate the effectiveness of three well-known chatbots (ChatGPT, Claude, and Bard) in responding to public health questions about myopia.

METHODS

Each of the three chatbots individually answered nineteen public health questions about myopia, covering three topics: policy, basics, and measures. After the order was shuffled, each chatbot response was independently rated by four raters for comprehensiveness, accuracy, and relevance.
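The shuffle-then-rate procedure can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration only: the function names, the opaque response IDs, and in particular the rule that a composite score is the mean of the three criterion means, which the abstract does not specify. It is not the authors' actual scoring procedure.

```python
import random
from statistics import mean

CHATBOTS = ["ChatGPT", "Claude", "Bard"]
CRITERIA = ["comprehensiveness", "accuracy", "relevance"]


def blind_and_shuffle(responses, seed=0):
    """Shuffle one question's responses and hide which chatbot wrote each.

    responses: {chatbot_name: response_text}
    Returns (blinded, key): blinded is a list of (response_id, text) shown to
    raters; key maps response_id back to the chatbot (kept from raters).
    """
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    items = list(responses.items())
    rng.shuffle(items)
    blinded, key = [], {}
    for i, (bot, text) in enumerate(items, start=1):
        rid = f"R{i}"  # hypothetical opaque ID
        blinded.append((rid, text))
        key[rid] = bot
    return blinded, key


def composite_scores(ratings, key):
    """Average ratings per chatbot.

    ratings: {response_id: [ {criterion: score}, ... ]}, one dict per rater.
    Assumption: composite = mean over criteria of the mean over raters.
    """
    scores = {}
    for rid, per_rater in ratings.items():
        per_criterion = [mean(r[c] for r in per_rater) for c in CRITERIA]
        scores[key[rid]] = mean(per_criterion)
    return scores
```

In use, `blind_and_shuffle` would be called once per question, the four raters would score the blinded texts on a 1 to 5 scale, and `composite_scores` would map the averaged ratings back to chatbot names only after rating is complete.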

RESULTS

The study's questions underwent reliability testing. The word counts of the responses differed significantly among the three chatbots; from most to least, the order was ChatGPT, Bard, and Claude. All three chatbots had a composite score above 4 out of 5, and ChatGPT scored highest on every aspect assessed. However, all of the chatbots showed shortcomings, such as giving fabricated responses.

CONCLUSION

Chatbots have shown great potential in public health, with ChatGPT performing best. Using chatbots as a public health tool in the future will require rapid development of standards for their use and monitoring, as well as continued research, evaluation, and improvement of the chatbots themselves.
