Accuracy and consistency of online large language model-based artificial intelligence chat platforms in answering patients' questions about heart failure.

Authors

Kozaily Elie, Geagea Mabelissa, Akdogan Ecem R, Atkins Jessica, Elshazly Mohamed B, Guglin Maya, Tedford Ryan J, Wehbe Ramsey M

Affiliations

Division of Cardiology, Department of Medicine, Medical University of South Carolina, Charleston, SC, USA.

Division of Cardiology, Department of Medicine, Hotel-Dieu de France, Beirut, Lebanon.

Publication information

Int J Cardiol. 2024 Aug 1;408:132115. doi: 10.1016/j.ijcard.2024.132115. Epub 2024 Apr 30.

DOI: 10.1016/j.ijcard.2024.132115
PMID: 38697402
Abstract

BACKGROUND

Heart failure (HF) is a prevalent condition associated with significant morbidity. Patients may have questions that they feel embarrassed to ask, or they may face delays awaiting responses from their healthcare providers, which may affect their health behavior. We aimed to investigate the potential of large language model (LLM)-based artificial intelligence (AI) chat platforms to complement the delivery of patient-centered care.

METHODS

Using online patient forums and physician experience, we created 30 questions related to diagnosis, management and prognosis of HF. The questions were posed to two LLM-based AI chat platforms (OpenAI's ChatGPT-3.5 and Google's Bard). Each set of answers was evaluated by two HF experts, independently and blinded to each other, for accuracy (adequacy of content) and consistency of content.

RESULTS

ChatGPT provided mostly appropriate answers (27/30, 90%) and showed a high degree of consistency (93%). Bard provided similar content in its answers and thus was evaluated only for adequacy (23/30, 77%). The two HF experts' grades were concordant for 83% and 67% of the questions for ChatGPT and Bard, respectively.

CONCLUSION

LLM-based AI chat platforms demonstrate potential for improving HF education and empowering patients; however, these platforms currently suffer from factual errors and have difficulty with more contemporary recommendations. This inaccurate information may have serious and life-threatening implications for patients, which should be considered and addressed in future research.
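
The percentages reported in the Results follow directly from the stated counts. As a minimal sketch (not part of the study), the Python below recomputes 27/30 and 23/30 and shows one simple way to compute percent agreement between two graders. The function names and the four toy grades are hypothetical. The abstract does not publish per-question grades, and it does not state how concordance was calculated, so simple percent agreement is an assumption here.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to the nearest integer."""
    return round(100 * numerator / denominator)


def percent_agreement(grades_a: list[str], grades_b: list[str]) -> float:
    """Share of items (in percent) on which two graders gave the same grade.

    Assumption: simple percent agreement; the abstract does not say which
    concordance measure the authors used.
    """
    if len(grades_a) != len(grades_b):
        raise ValueError("both graders must grade the same number of items")
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return 100 * matches / len(grades_a)


# Counts taken from the abstract.
chatgpt_appropriate = percent(27, 30)
bard_appropriate = percent(23, 30)
print(chatgpt_appropriate, bard_appropriate)

# Toy grades, made up purely for illustration (NOT study data).
toy_expert_1 = ["appropriate", "appropriate", "inappropriate", "appropriate"]
toy_expert_2 = ["appropriate", "appropriate", "appropriate", "appropriate"]
toy_agreement = percent_agreement(toy_expert_1, toy_expert_2)
print(toy_agreement)
```

With 30 questions, each question accounts for about 3.3 percentage points, so small differences between the two platforms correspond to only a handful of answers.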
