Evaluating the Quality of Cardiovascular Disease Information From AI Chatbots: A Comparative Study.

Authors

Singavarapu Joshua, Khemlani Amber, Jacobs Menachem, Berglas Eli, Lazar Jason, Kabarriti Abdo

Affiliations

Cardiology, State University of New York Downstate Health Sciences University, Brooklyn, USA.

Urology, State University of New York Downstate Health Sciences University, Brooklyn, USA.

Publication information

Cureus. 2025 Jul 16;17(7):e88085. doi: 10.7759/cureus.88085. eCollection 2025 Jul.

DOI:10.7759/cureus.88085
PMID:40821349
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12356239/
Abstract

Artificial intelligence (AI) is increasingly being utilized as an informational resource, with chatbots attracting users for their ability to generate instantaneous responses. This study evaluates the understandability, actionability, readability, quality, and misinformation in medical information provided by four prominent chatbots - Bard, ChatGPT 3.5, Claude 2.0, and Perplexity - on three prevalent cardiovascular diseases (CVDs): myocardial infarctions, heart failure, and arrhythmias. These chatbots were used because of their popularity and high usage rates among chatbots. Using Google Trends, the top five U.S. search queries related to heart attack, arrhythmia, and heart failure from September 29, 2018, to September 29, 2023, were identified. The top five queries were chosen in relation to these topics because they accounted for over 80% of the public's searches related to these topics. The chatbot responses were blinded and analyzed by two evaluators using DISCERN for quality, Patient Education Materials Assessment Tool (PEMAT) for understandability and actionability, and Flesch-Kincaid scores for readability. Statistical tests included the Kruskal-Wallis test for DISCERN, the chi-square test for PEMAT, and one-way ANOVA for Flesch-Kincaid scores. Bard generated responses with a statistically lower Flesch-Kincaid reading score than the other chatbots. Bard and ChatGPT 3.5 provided more actionable responses. Among the CVD topics, "heart attack" yielded lower-grade-level responses and more actionable information compared to "arrhythmia" and "heart failure." This study is among the first to assess AI credibility in disseminating cardiovascular information. It highlights how acute pathologic events may prompt more actionable and accessible chatbot responses. As AI continues to evolve, collaboration among healthcare professionals, researchers, and developers is crucial to ensuring the safe and effective use of AI in patient education and public health.
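
The readability measure named in the abstract can be illustrated with a short sketch. The abstract does not say which Flesch-Kincaid variant or which software the authors used, so the code below computes both standard formulas (Flesch-Kincaid Grade Level and Flesch Reading Ease) and relies on a rough vowel-group syllable counter. The function names and the syllable heuristic are assumptions made for illustration, not the study's method; dedicated readability tools count syllables more carefully and may return slightly different values.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: higher means harder to read."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher means easier to read."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


if __name__ == "__main__":
    # Hypothetical sample responses, not taken from the study.
    simple = "Call for help now. Sit down and rest."
    dense = "Ventricular arrhythmias necessitate immediate electrophysiological evaluation."
    for label, text in (("simple", simple), ("dense", dense)):
        print(label, round(flesch_kincaid_grade(text), 2), round(flesch_reading_ease(text), 2))
```

Short sentences built from short words score a lower grade level and a higher reading ease, which is the direction in which the abstract describes the "heart attack" responses as more accessible.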

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1459/12356239/88519ca9c338/cureus-0017-00000088085-i02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1459/12356239/993f15ea4130/cureus-0017-00000088085-i01.jpg

Similar articles

1
Evaluating the Quality of Cardiovascular Disease Information From AI Chatbots: A Comparative Study.
Cureus. 2025 Jul 16;17(7):e88085. doi: 10.7759/cureus.88085. eCollection 2025 Jul.
2
Quality of Information on Wilms Tumor From Artificial Intelligence Chatbots: What Are Your Patients and Their Families Reading?
Urology. 2025 Apr;198:130-134. doi: 10.1016/j.urology.2025.01.054. Epub 2025 Feb 4.
3
Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini.
Cureus. 2025 Jun 1;17(6):e85174. doi: 10.7759/cureus.85174. eCollection 2025 Jun.
4
Evaluating the role of AI chatbots in patient education for abdominal aortic aneurysms: a comparison of ChatGPT and conventional resources.
ANZ J Surg. 2025 Apr;95(4):784-788. doi: 10.1111/ans.70053. Epub 2025 Mar 5.
5
Assessing chatbots ability to produce leaflets on cataract surgery: Bing AI, chatGPT 3.5, chatGPT 4o, ChatSonic, Google Bard, Perplexity, and Pi.
J Cataract Refract Surg. 2025 May 1;51(5):371-375. doi: 10.1097/j.jcrs.0000000000001622.
6
Prescription of Controlled Substances: Benefits and Risks
7
Evaluating AI chatbots in penis enhancement information: a comparative analysis of readability, reliability and quality.
Int J Impot Res. 2025 Jun 3. doi: 10.1038/s41443-025-01098-3.
8
The Reliability Gap: How Traditional Search Engines Outperform Artificial Intelligence (AI) Chatbots in Rosacea Public Health Information Quality.
Cureus. 2025 Jun 22;17(6):e86543. doi: 10.7759/cureus.86543. eCollection 2025 Jun.
9
Evaluating the effectiveness of chatbots and traditional resources in patient education on dry eye disease.
Clin Exp Optom. 2025 Jun 26:1-5. doi: 10.1080/08164622.2025.2517750.
10
Evaluating the Performance of State-of-the-Art Artificial Intelligence Chatbots Based on the WHO Global Guidelines for the Prevention of Surgical Site Infection: Cross-Sectional Study.
J Med Internet Res. 2025 Jul 31;27:e75567. doi: 10.2196/75567.

References cited in this article

1
Assessing the response quality and readability of chatbots in cardiovascular health, oncology, and psoriasis: A comparative study.
Int J Med Inform. 2024 Oct;190:105562. doi: 10.1016/j.ijmedinf.2024.105562. Epub 2024 Jul 19.
2
Evaluation of Prompts to Simplify Cardiovascular Disease Information Generated Using a Large Language Model: Cross-Sectional Study.
J Med Internet Res. 2024 Apr 22;26:e55388. doi: 10.2196/55388.
3
The Breakthrough of Large Language Models Release for Medical Applications: 1-Year Timeline and Perspectives.
J Med Syst. 2024 Feb 17;48(1):22. doi: 10.1007/s10916-024-02045-3.
4
Artificial Intelligence-Powered Patient Education for Comprehensive and Individualized Understanding for Patients.
Clin Gastroenterol Hepatol. 2024 Jul;22(7):1550-1551. doi: 10.1016/j.cgh.2023.10.027. Epub 2023 Nov 7.
5
Heart Failure Epidemiology and Outcomes Statistics: A Report of the Heart Failure Society of America.
J Card Fail. 2023 Oct;29(10):1412-1451. doi: 10.1016/j.cardfail.2023.07.006. Epub 2023 Sep 26.
6
Assessment of Artificial Intelligence Chatbot Responses to Top Searched Queries About Cancer.
JAMA Oncol. 2023 Oct 1;9(10):1437-1440. doi: 10.1001/jamaoncol.2023.2947.
7
Epidemiology of Geographic Disparities of Myocardial Infarction Among Older Adults in the United States: Analysis of 2000-2017 Medicare Data.
Front Cardiovasc Med. 2021 Sep 9;8:707102. doi: 10.3389/fcvm.2021.707102. eCollection 2021.
8
Reliability of Google Trends: Analysis of the Limits and Potential of Web Infoveillance During COVID-19 Pandemic and for Future Research.
Front Res Metr Anal. 2021 May 25;6:670226. doi: 10.3389/frma.2021.670226. eCollection 2021.
9
Artificial Intelligence-Based Conversational Agents for Chronic Conditions: Systematic Literature Review.
J Med Internet Res. 2020 Sep 14;22(9):e20701. doi: 10.2196/20701.
10
Development of the Patient Education Materials Assessment Tool (PEMAT): a new measure of understandability and actionability for print and audiovisual patient information.
Patient Educ Couns. 2014 Sep;96(3):395-403. doi: 10.1016/j.pec.2014.05.027. Epub 2014 Jun 12.