

Assessing the readability, reliability, and quality of artificial intelligence chatbot responses to the 100 most searched queries about cardiopulmonary resuscitation: An observational study.

Affiliations

Department of Anesthesiology and Reanimation, School of Medicine, Dokuz Eylul University, Izmir, Turkey.

Department of Artificial Intelligence Engineering, Faculty of Engineering, Ostim Technical University, Ankara, Turkey.

Publication Information

Medicine (Baltimore). 2024 May 31;103(22):e38352. doi: 10.1097/MD.0000000000038352.

DOI: 10.1097/MD.0000000000038352
PMID: 39259094
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11142831/
Abstract

This study aimed to evaluate the readability, reliability, and quality of responses by 4 selected artificial intelligence (AI)-based large language model (LLM) chatbots to questions related to cardiopulmonary resuscitation (CPR). This was a cross-sectional study. Responses to the 100 most frequently asked questions about CPR by 4 selected chatbots (ChatGPT-3.5 [OpenAI], Google Bard [Google AI], Google Gemini [Google AI], and Perplexity [Perplexity AI]) were analyzed for readability, reliability, and quality. The chatbots were asked the following question in English: "What are the 100 most frequently asked questions about cardiopulmonary resuscitation?" Each of the 100 queries derived from the responses was individually posed to the 4 chatbots. The 400 responses, treated as patient education materials (PEMs), were assessed for quality and reliability using the modified DISCERN questionnaire, the Journal of the American Medical Association (JAMA) benchmark criteria, and the Global Quality Score. Readability was assessed with 2 different calculators, which independently computed scores using the Flesch Reading Ease Score, Flesch-Kincaid Grade Level, Simple Measure of Gobbledygook, Gunning Fog Index, and Automated Readability Index. We analyzed 100 responses from each of the 4 chatbots. When the median readability values obtained from Calculators 1 and 2 were compared with the 6th-grade reading level, there was a highly significant difference between the groups (P < .001). Across all formulas, the readability level of the responses was above the 6th grade. The order of readability, from easiest to most difficult, was Bard, Perplexity, Gemini, and ChatGPT-3.5. The text content provided by all 4 chatbots was therefore written above the 6th-grade reading level. We believe that enhancing the quality, reliability, and readability of PEMs will make them easier for readers to understand and support more accurate performance of CPR; in turn, patients who receive bystander CPR may have an increased likelihood of survival.
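
For reference, the five readability indices named in the abstract are all closed-form functions of sentence, word, syllable, and character counts. The Python sketch below is not the study's tooling (the authors used 2 online calculators, unnamed in the abstract); it is a minimal illustration of the published formulas, with a crude vowel-group syllable counter standing in for the dictionary-based counters real calculators use.

```python
import math
import re

def count_syllables(word: str) -> int:
    """Crude vowel-group syllable count -- a simplification standing in
    for the dictionary-based counters real readability tools use."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # treat a final 'e' as silent
    return max(n, 1)

def readability_scores(text: str) -> dict:
    """Compute the 5 indices named in the abstract from raw text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    chars = sum(len(w) for w in words)
    syllables = sum(count_syllables(w) for w in words)
    # "Complex" words (3+ syllables) drive SMOG and Gunning Fog.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)

    wps = len(words) / len(sentences)   # words per sentence
    spw = syllables / len(words)        # syllables per word

    return {
        # Flesch Reading Ease Score: higher = easier to read
        "FRES": 206.835 - 1.015 * wps - 84.6 * spw,
        # Flesch-Kincaid Grade Level: maps to a U.S. school grade
        "FKGL": 0.39 * wps + 11.8 * spw - 15.59,
        # Simple Measure of Gobbledygook (SMOG) grade
        "SMOG": 1.043 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291,
        # Gunning Fog Index
        "FOG": 0.4 * (wps + 100 * polysyllables / len(words)),
        # Automated Readability Index: character-based, needs no syllables
        "ARI": 4.71 * chars / len(words) + 0.5 * wps - 21.43,
    }

if __name__ == "__main__":
    sample = ("Cardiopulmonary resuscitation is an emergency procedure that "
              "combines chest compressions with artificial ventilation.")
    for name, score in readability_scores(sample).items():
        print(f"{name}: {score:.1f}")
```

Running each chatbot response through such a function yields per-response grade levels; the abstract's comparison against a 6th-grade benchmark (P < .001) could then be reproduced with a one-sample test such as scipy.stats.wilcoxon(np.array(fkgl_scores) - 6), though the specific test is not named in the abstract, so that choice is an assumption.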


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d1c/11142831/b96303c90b97/medi-103-e38352-g001.jpg

Similar Articles

[1] Assessing the readability, reliability, and quality of artificial intelligence chatbot responses to the 100 most searched queries about cardiopulmonary resuscitation: An observational study. Medicine (Baltimore). 2024-5-31
[2] Assessment of readability, reliability, and quality of ChatGPT®, BARD®, Gemini®, Copilot®, Perplexity® responses on palliative care. Medicine (Baltimore). 2024-8-16
[3] Readability, reliability and quality of responses generated by ChatGPT, Gemini, and Perplexity for the most frequently asked questions about pain. Medicine (Baltimore). 2025-3-14
[4] Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study. Medicine (Baltimore). 2025-4-11
[5] Assessing the readability, quality and reliability of responses produced by ChatGPT, Gemini, and Perplexity regarding most frequently asked keywords about low back pain. PeerJ. 2025-1-22
[6] Assessing the Readability of Patient Education Materials on Cardiac Catheterization From Artificial Intelligence Chatbots: An Observational Cross-Sectional Study. Cureus. 2024-7-4
[7] How artificial intelligence can provide information about subdural hematoma: Assessment of readability, reliability, and quality of ChatGPT, BARD, and Perplexity responses. Medicine (Baltimore). 2024-5-3
[8] Accuracy and Readability of Artificial Intelligence Chatbot Responses to Vasectomy-Related Questions: Public Beware. Cureus. 2024-8-28
[9] AI Chatbots as Sources of STD Information: A Study on Reliability and Readability. J Med Syst. 2025-4-3
[10] Appropriateness and readability of Google Bard and ChatGPT-3.5 generated responses for surgical treatment of glaucoma. Rom J Ophthalmol. 2024

Cited By

[1] Comparison of the readability of ChatGPT and Bard in medical communication: a meta-analysis. BMC Med Inform Decis Mak. 2025-9-1
[2] Evaluating large language models in patient education on facial plastic surgery: a standardized protocol. Int J Surg Protoc. 2025-6-11
[3] Evaluation of DeepSeek, Gemini, ChatGPT-4o, and Perplexity in responding to salivary gland cancer. BMC Oral Health. 2025-8-23
[4] Evaluating the readability, quality, and reliability of responses generated by ChatGPT, Gemini, and Perplexity on the most commonly asked questions about ankylosing spondylitis. PLoS One. 2025-6-18
[5] Comparative analysis of language models in addressing syphilis-related queries. Med Oral Patol Oral Cir Bucal. 2025-7-1
[6] Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study. Medicine (Baltimore). 2025-4-11
[7] Readability, reliability and quality of responses generated by ChatGPT, Gemini, and Perplexity for the most frequently asked questions about pain. Medicine (Baltimore). 2025-3-14
[8] Assessing the readability, quality and reliability of responses produced by ChatGPT, Gemini, and Perplexity regarding most frequently asked keywords about low back pain. PeerJ. 2025-1-22

References

[1] Large language models as assistance for glaucoma surgical cases: a ChatGPT vs. Google Gemini comparison. Graefes Arch Clin Exp Ophthalmol. 2024-9
[2] The Temperature Feature of ChatGPT: Modifying Creativity for Clinical Research. JMIR Hum Factors. 2024-3-8
[3] Exploring AI-chatbots' capability to suggest surgical planning in ophthalmology: ChatGPT versus Google Gemini analysis of retinal detachment cases. Br J Ophthalmol. 2024-9-20
[4] Both Patients and Plastic Surgeons Prefer Artificial Intelligence-Generated Microsurgical Information. J Reconstr Microsurg. 2024-11
[5] Google DeepMind's Gemini AI versus ChatGPT: a comparative analysis in ophthalmology. Eye (Lond). 2024-6
[6] Assessing the accuracy, usefulness, and readability of artificial-intelligence-generated responses to common dermatologic surgery questions for patient education: A double-blinded comparative study of ChatGPT and Google Bard. J Am Acad Dermatol. 2024-5
[7] Talking technology: exploring chatbots as a tool for cataract patient education. Clin Exp Optom. 2025-1
[8] The Quality of CLP-Related Information for Patients Provided by ChatGPT. Cleft Palate Craniofac J. 2025-4
[9] ChatGPT Performance in Diagnostic Clinical Microbiology Laboratory-Oriented Case Scenarios. Cureus. 2023-12-16
[10] Evaluation of Oropharyngeal Cancer Information from Revolutionary Artificial Intelligence Chatbot. Laryngoscope. 2024-5
