Head-to-Head Comparison of ChatGPT Versus Google Search for Medical Knowledge Acquisition.

Author Information

Ayoub Noel F, Lee Yu-Jin, Grimm David, Divi Vasu

Affiliation

Department of Otolaryngology-Head and Neck Surgery, Division of Head & Neck Surgery, Stanford University School of Medicine, Stanford, California, USA.

Publication Information

Otolaryngol Head Neck Surg. 2024 Jun;170(6):1484-1491. doi: 10.1002/ohn.465. Epub 2023 Aug 2.


DOI: 10.1002/ohn.465
PMID: 37529853
Abstract

OBJECTIVE: Chat Generative Pretrained Transformer (ChatGPT) is the newest iteration of OpenAI's generative artificial intelligence (AI) with the potential to influence many facets of life, including health care. This study sought to assess ChatGPT's capabilities as a source of medical knowledge, using Google Search as a comparison.

STUDY DESIGN: Cross-sectional analysis.

SETTING: Online using ChatGPT, Google Search, and Clinical Practice Guidelines (CPG).

METHODS: CPG Plain Language Summaries for 6 conditions were obtained. Questions relevant to specific conditions were developed and input into ChatGPT and Google Search. All questions were written from the patient perspective and sought (1) general medical knowledge or (2) medical recommendations, with varying levels of acuity (urgent or emergent vs routine clinical scenarios). Two blinded reviewers scored all passages and compared results from ChatGPT and Google Search, using the Patient Education Material Assessment Tool (PEMAT-P) as the primary outcome. Additional customized questions were developed that assessed the medical content of the passages.

RESULTS: The overall average PEMAT-P score for medical advice was 68.2% (standard deviation [SD]: 4.4) for ChatGPT and 89.4% (SD: 5.9) for Google Search (p < .001). There was a statistically significant difference in the PEMAT-P score by source (p < .001) but not by urgency of the clinical situation (p = .613). ChatGPT scored significantly higher than Google Search (87% vs 78%, p = .012) for patient education questions.

CONCLUSION: ChatGPT fared better than Google Search when offering general medical knowledge, but it scored worse when providing medical recommendations. Health care providers should strive to understand the potential benefits and ramifications of generative AI to guide patients appropriately.

Similar Articles

[1]
Head-to-Head Comparison of ChatGPT Versus Google Search for Medical Knowledge Acquisition.

Otolaryngol Head Neck Surg. 2024-6

[2]
Using Google web search to analyze and evaluate the application of ChatGPT in femoroacetabular impingement syndrome.

Front Public Health. 2024

[3]
Evaluating ChatGPT's Utility in Medicine Guidelines Through Web Search Analysis.

Perm J. 2024-6-14

[4]
ChatGPT vs. web search for patient questions: what does ChatGPT do better?

Eur Arch Otorhinolaryngol. 2024-6

[5]
Do ChatGPT and Google differ in answers to commonly asked patient questions regarding total shoulder and total elbow arthroplasty?

J Shoulder Elbow Surg. 2024-8

[6]
Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard.

BMJ Open Ophthalmol. 2024-10-17

[7]
Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery.

Semin Ophthalmol. 2024-8

[8]
BPPV Information on Google Versus AI (ChatGPT).

Otolaryngol Head Neck Surg. 2024-6

[9]
Application of artificial intelligence chatbots, including ChatGPT, in education, scholarly work, programming, and content generation and its prospects: a narrative review.

J Educ Eval Health Prof. 2023

[10]
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.

J Med Internet Res. 2023-12-28

Cited By

[1]
Evaluating ChatGPT's Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search.

JMIR Form Res. 2025-8-28

[2]
Evaluating ChatGPT's Concordance with Clinical Guidelines of Ménière's Disease in Chinese.

Diagnostics (Basel). 2025-8-11

[3]
Evaluating a Nationally Localized AI Chatbot for Personalized Primary Care Guidance: Insights from the HomeDOCtor Deployment in Slovenia.

Healthcare (Basel). 2025-7-29

[4]
Generative AI/LLMs for Plain Language Medical Information for Patients, Caregivers and General Public: Opportunities, Risks and Ethics.

Patient Prefer Adherence. 2025-7-31

[5]
Patient agency and large language models in worldwide encoding of equity.

NPJ Digit Med. 2025-5-8

[6]
Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study.

Medicine (Baltimore). 2025-4-11

[7]
From Engagement to Concerns: Social Media Use Among a Sample of Australian Public Health Professionals.

Health Promot J Austr. 2025-4

[8]
ChatGPT, Google, or PINK? Who Provides the Most Reliable Information on Side Effects of Systemic Therapy for Early Breast Cancer?

Clin Pract. 2024-12-31

[9]
Current applications and challenges in large language models for patient care: a systematic review.

Commun Med (Lond). 2025-1-21

[10]
The Goldilocks Zone: Finding the right balance of user and institutional risk for suicide-related generative AI queries.

PLOS Digit Health. 2025-1-8
