The role of ChatGPT-4o in differential diagnosis and management of vertigo-related disorders.

Authors

Liu Xu, Shi Suming, Zhang Xin, Gao Qianwen, Wang Wuqing

Affiliations

ENT Institute, Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, Shanghai, 200031, China.

NHC Key Laboratory of Hearing Medicine (Fudan University), Shanghai, 200031, China.

Publication information

Sci Rep. 2025 May 28;15(1):18688. doi: 10.1038/s41598-025-96309-8.

DOI: 10.1038/s41598-025-96309-8

PMID: 40437044

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12119837/

Abstract

This study compared the diagnostic accuracy of an artificial intelligence (AI) chatbot with that of clinical experts in vertigo-related diseases and evaluated the chatbot's ability to address vertigo-related questions. Twenty clinical questions about vertigo were submitted to ChatGPT-4o, and three otologists rated the responses on a 5-point Likert scale for accuracy, comprehensiveness, clarity, practicality, and credibility. Readability was assessed with the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The model and two otologists each diagnosed 15 outpatient vertigo cases, and diagnostic accuracy was calculated. The Kruskal-Wallis test, analysis of variance (ANOVA), and paired t-test were used for statistical analysis. ChatGPT-4o scored highest in credibility (4.78). Repeated-measures ANOVA showed statistically significant differences in the scores for the 20 responses across the five rating dimensions (F = 2.682, p = 0.038). Readability analysis showed that diagnosis-related outputs were harder to read than other types of content. The model's diagnostic accuracy was comparable to that of a clinician with one year of experience but lower than that of a clinician with five years of experience, and the differences in accuracy among the three were statistically significant (p = 0.04). ChatGPT-4o shows promise as a supplementary tool for managing vertigo but requires improvements in readability and diagnostic capability.
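The two readability measures named in the abstract are standard published formulas based on average sentence length and average syllables per word. The abstract does not say which tool the authors used to compute them, so the following is only a minimal sketch of the formulas themselves. The function names are hypothetical, and the syllable counter is a crude vowel-group heuristic assumed here for illustration; dedicated readability tools count syllables more carefully and will give somewhat different scores.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade needed."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough heuristic (an assumption): count runs of vowels, drop a silent final 'e'."""
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    if w.endswith("e") and not w.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text):
    """Return (words, sentences, syllables) for a plain English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    # Illustrative sentence only; not taken from the study's materials.
    sample = "Vertigo is common. Rest helps."
    w, s, syl = text_counts(sample)
    print(w, s, syl)
    print(round(flesch_reading_ease(w, s, syl), 2))
    print(round(flesch_kincaid_grade(w, s, syl), 2))
```

Because both formulas depend only on the two ratios, longer sentences and more polysyllabic terms lower the Reading Ease score and raise the grade level, which is consistent with diagnosis-related outputs being reported as harder to read.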
