Head-to-Head Comparison of ChatGPT Versus Google Search for Medical Knowledge Acquisition.

Author Information

Ayoub Noel F, Lee Yu-Jin, Grimm David, Divi Vasu

Affiliation

Department of Otolaryngology-Head and Neck Surgery, Division of Head & Neck Surgery, Stanford University School of Medicine, Stanford, California, USA.

Publication Information

Otolaryngol Head Neck Surg. 2024 Jun;170(6):1484-1491. doi: 10.1002/ohn.465. Epub 2023 Aug 2.


DOI: 10.1002/ohn.465
PMID: 37529853
Abstract

OBJECTIVE: Chat Generative Pretrained Transformer (ChatGPT) is the newest iteration of OpenAI's generative artificial intelligence (AI) with the potential to influence many facets of life, including health care. This study sought to assess ChatGPT's capabilities as a source of medical knowledge, using Google Search as a comparison.

STUDY DESIGN: Cross-sectional analysis.

SETTING: Online using ChatGPT, Google Search, and Clinical Practice Guidelines (CPG).

METHODS: CPG Plain Language Summaries for 6 conditions were obtained. Questions relevant to specific conditions were developed and input into ChatGPT and Google Search. All questions were written from the patient perspective and sought (1) general medical knowledge or (2) medical recommendations, with varying levels of acuity (urgent or emergent vs routine clinical scenarios). Two blinded reviewers scored all passages and compared results from ChatGPT and Google Search, using the Patient Education Material Assessment Tool (PEMAT-P) as the primary outcome. Additional customized questions were developed that assessed the medical content of the passages.

RESULTS: The overall average PEMAT-P score for medical advice was 68.2% (standard deviation [SD]: 4.4) for ChatGPT and 89.4% (SD: 5.9) for Google Search (p < .001). There was a statistically significant difference in the PEMAT-P score by source (p < .001) but not by urgency of the clinical situation (p = .613). ChatGPT scored significantly higher than Google Search (87% vs 78%, p = .012) for patient education questions.

CONCLUSION: ChatGPT fared better than Google Search when offering general medical knowledge, but it scored worse when providing medical recommendations. Health care providers should strive to understand the potential benefits and ramifications of generative AI to guide patients appropriately.

Similar Articles

[1]
Head-to-Head Comparison of ChatGPT Versus Google Search for Medical Knowledge Acquisition.

Otolaryngol Head Neck Surg. 2024-6

[2]
Using Google web search to analyze and evaluate the application of ChatGPT in femoroacetabular impingement syndrome.

Front Public Health. 2024

[3]
Evaluating ChatGPT's Utility in Medicine Guidelines Through Web Search Analysis.

Perm J. 2024-6-14

[4]
ChatGPT vs. web search for patient questions: what does ChatGPT do better?

Eur Arch Otorhinolaryngol. 2024-6

[5]
Do ChatGPT and Google differ in answers to commonly asked patient questions regarding total shoulder and total elbow arthroplasty?

J Shoulder Elbow Surg. 2024-8

[6]
Artificial intelligence chatbots as sources of patient education material for cataract surgery: ChatGPT-4 versus Google Bard.

BMJ Open Ophthalmol. 2024-10-17

[7]
Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery.

Semin Ophthalmol. 2024-8

[8]
BPPV Information on Google Versus AI (ChatGPT).

Otolaryngol Head Neck Surg. 2024-6

[9]
Application of artificial intelligence chatbots, including ChatGPT, in education, scholarly work, programming, and content generation and its prospects: a narrative review.

J Educ Eval Health Prof. 2023

[10]
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.

J Med Internet Res. 2023-12-28

Cited By

[1]
Evaluating ChatGPT's Utility in Biologic Therapy for Systemic Lupus Erythematosus: Comparative Study of ChatGPT and Google Web Search.

JMIR Form Res. 2025-8-28

[2]
Evaluating ChatGPT's Concordance with Clinical Guidelines of Ménière's Disease in Chinese.

Diagnostics (Basel). 2025-8-11

[3]
Evaluating a Nationally Localized AI Chatbot for Personalized Primary Care Guidance: Insights from the HomeDOCtor Deployment in Slovenia.

Healthcare (Basel). 2025-7-29

[4]
Generative AI/LLMs for Plain Language Medical Information for Patients, Caregivers and General Public: Opportunities, Risks and Ethics.

Patient Prefer Adherence. 2025-7-31

[5]
Patient agency and large language models in worldwide encoding of equity.

NPJ Digit Med. 2025-5-8

[6]
Assessing the quality and readability of patient education materials on chemotherapy cardiotoxicity from artificial intelligence chatbots: An observational cross-sectional study.

Medicine (Baltimore). 2025-4-11

[7]
From Engagement to Concerns: Social Media Use Among a Sample of Australian Public Health Professionals.

Health Promot J Austr. 2025-4

[8]
ChatGPT, Google, or PINK? Who Provides the Most Reliable Information on Side Effects of Systemic Therapy for Early Breast Cancer?

Clin Pract. 2024-12-31

[9]
Current applications and challenges in large language models for patient care: a systematic review.

Commun Med (Lond). 2025-1-21

[10]
The Goldilocks Zone: Finding the right balance of user and institutional risk for suicide-related generative AI queries.

PLOS Digit Health. 2025-1-8
