Does ChatGPT update itself? Accuracy of ChatGPT in tympanostomy tube guidance: A comparative analysis with current literature.

Author information

Durgut Osman, Dikici Oğuzhan

Affiliations

Department of Otorhinolaryngology, Health Science University, Bursa City Hospital, T.C. Sağlık Bakanlığı Bursa Şehir Hastanesi, Doğanköy Mahallesi, Nilüfer/Bursa, 16110, Turkey.

Publication information

Eur Arch Otorhinolaryngol. 2025 Aug 23. doi: 10.1007/s00405-025-09630-3.

DOI: 10.1007/s00405-025-09630-3
PMID: 40849397

Abstract

OBJECTIVE

This study aims to evaluate the accuracy of ChatGPT-4.0 in providing information on tympanostomy tube indications in children, comparing its responses with established clinical guidelines and examining its ability to update itself over time.

METHODS

Sixteen clinical scenarios from the American Academy of Otolaryngology-Head and Neck Surgery Foundation (AAO-HNSF) guidelines were assessed using 18 specific questions. Responses were evaluated by two otolaryngologists and ChatGPT itself. The final validation was conducted by a senior otolaryngologist. Cohen's Kappa analysis was performed to assess inter-rater reliability.
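
For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen's Kappa can be computed for one pair of raters. The abstract does not describe how the authors ran the analysis, so the function name `cohens_kappa` and the rating lists are hypothetical and serve only to illustrate the formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is 0/0.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of 18 answers (NOT the study's data).
rater_1 = ["correct"] * 17 + ["incorrect"]
rater_2 = ["correct"] * 17 + ["incorrect"]
rater_3 = ["correct"] * 16 + ["incorrect"] * 2

print(cohens_kappa(rater_1, rater_2))            # 1.0 (identical ratings)
print(round(cohens_kappa(rater_1, rater_3), 2))  # 0.64 (one disagreement)
```

Cohen's Kappa is defined for two raters at a time, so with three evaluators it would be applied pairwise under this sketch. A value of 1.0 corresponds to perfect agreement.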

RESULTS

ChatGPT-4.0 correctly answered 15.5 of 16 scenarios (96.8%); the second-stage question of scenario 7 was evaluated as incorrect. Among the correct answers, 4 scenarios were not fully aligned with the guidelines, but these answers were fully compliant when the responses were based on current literature. When current literature was referenced, all responses reached 100% accuracy. Agreement among the three evaluators was perfect, as confirmed by Cohen's Kappa analysis. Although an updated version (ChatGPT-4.0) was used and more than a year had passed, a scenario that had previously been answered incorrectly with ChatGPT-3.5 was again answered in the same incorrect manner. This suggests that the model may have limited capacity for self-updating over time. These findings are consistent with previous research indicating that ChatGPT provides highly accurate responses regarding tympanostomy tube placement and largely aligns with existing guidelines.
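
The fractional score of 15.5 out of 16 can be reproduced with a short sketch, under the assumption that a two-stage scenario earns half credit per question. The abstract does not state the scoring rule. It also does not say which scenarios besides scenario 7 had two questions, so the layout below, including the choice of scenario 12, is hypothetical.

```python
from fractions import Fraction


def scenario_score(question_results):
    """Fraction of a scenario's questions that were answered correctly."""
    return Fraction(sum(question_results), len(question_results))


# Hypothetical layout: 16 scenarios and 18 questions in total.
scenarios = {number: [True] for number in range(1, 17)}
scenarios[7] = [True, False]   # second-stage question judged incorrect
scenarios[12] = [True, True]   # ASSUMED second two-stage scenario

total = sum(scenario_score(results) for results in scenarios.values())
accuracy = total / len(scenarios)

print(float(total))           # 15.5
print(float(accuracy) * 100)  # 96.875
```

Under these assumptions the exact value is 96.875%, which matches the reported 96.8% if that figure was truncated rather than rounded.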

CONCLUSION

ChatGPT-4.0 demonstrates high accuracy in providing guideline-based medical information, but its ability to update itself over time appears to be limited. However, when prompted to reference current literature, its accuracy improves significantly. These findings highlight the importance of structured prompting and critical evaluation of AI-generated medical guidance.
