ChatGPT's adherence to otolaryngology clinical practice guidelines.

Author Information

Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel.

School of Medicine, Tel Aviv University, Tel Aviv, Israel.

Publication Information

Eur Arch Otorhinolaryngol. 2024 Jul;281(7):3829-3834. doi: 10.1007/s00405-024-08634-9. Epub 2024 Apr 22.

Abstract

OBJECTIVES

Large language models, including ChatGPT, have the potential to transform the way we approach medical knowledge, yet accuracy on clinical topics is critical. Here we assessed ChatGPT's adherence to the clinical practice guidelines of the American Academy of Otolaryngology-Head and Neck Surgery.

METHODS

We presented ChatGPT with 24 clinical otolaryngology questions based on the guidelines of the American Academy of Otolaryngology-Head and Neck Surgery. Each question was posed three times (N = 72) to test the model's consistency. Two otolaryngologists evaluated the responses for accuracy and relevance to the guidelines. Cohen's kappa was used to measure evaluator agreement, and Cronbach's alpha assessed the consistency of ChatGPT's responses across repetitions.
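The two reliability statistics named above can be reproduced with standard tooling. Below is a minimal sketch in Python, not the authors' analysis code: the three-level rating scale, the simulated score arrays, and the cronbach_alpha helper are illustrative assumptions, while Cohen's kappa comes from scikit-learn.

```python
# Sketch of the two reliability statistics from the Methods, on
# hypothetical data: two evaluators score 72 responses (24 questions
# asked three times each). None of the numbers are from the paper.
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical accuracy ratings (0 = contradicts guideline,
# 1 = partially accurate, 2 = highly accurate) from two evaluators,
# with the second rater disagreeing on a handful of responses.
rng = np.random.default_rng(0)
rater_a = rng.integers(0, 3, size=72)
rater_b = rater_a.copy()
rater_b[rng.choice(72, size=10, replace=False)] = rng.integers(0, 3, size=10)

# Inter-rater agreement between the two otolaryngologists.
kappa = cohen_kappa_score(rater_a, rater_b)

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_questions, n_repeats) score matrix;
    the repeats of each question play the role of the 'items'."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# 24 questions x 3 repetitions: consistency of the scored answers.
scores = rater_a.reshape(24, 3)
alpha = cronbach_alpha(scores)
print(f"Cohen's kappa = {kappa:.2f}, Cronbach's alpha = {alpha:.2f}")
```

In this framing, the three repetitions of a question act as the "items" in Cronbach's alpha, so a high alpha indicates that the model's scored answers to a question change little from run to run.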

RESULTS

The study revealed mixed results: 59.7% (43/72) of ChatGPT's responses were highly accurate, while only 2.8% (2/72) directly contradicted the guidelines. The model showed 100% accuracy in Head and Neck, but lower accuracy in Rhinology and Otology/Neurotology (66%), Laryngology (50%), and Pediatrics (8%). The model's responses were consistent for 17/24 questions (70.8%), with a Cronbach's alpha of 0.87, indicating reasonable consistency across tests.

CONCLUSIONS

On a structured, guideline-based question set, ChatGPT demonstrated consistency but variable accuracy in otolaryngology. Its lower performance in some areas, especially Pediatrics, suggests that further rigorous evaluation is needed before real-world clinical use can be considered.
