Chatbots as Patient Education Resources for Aesthetic Facial Plastic Surgery: Evaluation of ChatGPT and Google Bard Responses.

Affiliations

Department of Otolaryngology - Head and Neck Surgery, Thomas Jefferson University Hospitals, Philadelphia, Pennsylvania, USA.

Sidney Kimmel Medical College, Philadelphia, Pennsylvania, USA.

Publication information

Facial Plast Surg Aesthet Med. 2024 Nov-Dec;26(6):665-673. doi: 10.1089/fpsam.2023.0368. Epub 2024 Jul 1.

Abstract

Background: ChatGPT and Google Bard™ are popular artificial intelligence chatbots with utility for patients, including those undergoing aesthetic facial plastic surgery. Objective: To compare the accuracy and readability of chatbot-generated responses to patient education questions regarding aesthetic facial plastic surgery using a response accuracy scale and readability testing. Methods: ChatGPT and Google Bard™ were asked 28 identical questions using four prompts: none, patient friendly, eighth-grade level, and references. Accuracy was assessed using the Global Quality Scale (range: 1-5). Flesch-Kincaid grade level was calculated, and chatbot-provided references were analyzed for veracity. Results: Although 59.8% of responses were good quality (Global Quality Scale ≥4), ChatGPT generated more accurate responses than Google Bard™ on patient-friendly prompting (p < 0.001). Google Bard™ responses were of a significantly lower grade level than ChatGPT responses for all prompts (p < 0.05). Despite eighth-grade prompting, response grade level for both chatbots was high: ChatGPT (10.5 ± 1.8) and Google Bard™ (9.6 ± 1.3). Prompting for references yielded 108/108 chatbot-generated references. Forty-one (38.0%) citations were legitimate, and 20 (18.5%) accurately reported information from the reference. Conclusion: Although ChatGPT produced more accurate responses, and at a higher education level, than Google Bard™, both chatbots provided responses above recommended grade levels for patients and failed to provide accurate references.
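The readability measure named in the abstract, the Flesch-Kincaid grade level, is conventionally computed as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below illustrates that standard formula only; the abstract does not say which tool the authors used, and the syllable counter (`count_syllables`) and the tokenization rules are rough assumptions introduced for illustration, so scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    count = len(groups)
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)


def fk_grade_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid grade level formula from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the grade level of a text using simple regex tokenization."""
    # Sentences: non-empty chunks between runs of ., !, or ?
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: runs of letters and apostrophes
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return fk_grade_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    sample = (
        "Rhinoplasty reshapes the nose. "
        "Recovery usually takes several weeks."
    )
    print(round(flesch_kincaid_grade(sample), 1))
```

A score near 8 corresponds roughly to an eighth-grade reading level, which is the target the eighth-grade prompt in the study was aiming for.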
