Chasing sleep physicians: ChatGPT-4o on the interpretation of polysomnographic results.

Authors

Seifen Christopher, Huppertz Tilman, Gouveris Haralampos, Bahr-Hamm Katharina, Pordzik Johannes, Eckrich Jonas, Smith Harry, Kelsey Tom, Blaikie Andrew, Matthias Christoph, Kuhn Sebastian, Buhr Christoph Raphael

Affiliations

Sleep Medicine Center & Department of Otolaryngology, Head and Neck Surgery, University Medical Center Mainz, Mainz, Germany.

School of Computer Science, University of St Andrews, St Andrews, UK.

Publication information

Eur Arch Otorhinolaryngol. 2025 Mar;282(3):1631-1639. doi: 10.1007/s00405-024-08985-3. Epub 2024 Oct 20.

DOI:10.1007/s00405-024-08985-3

PMID:39427271

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11890353/

Abstract

BACKGROUND

From a healthcare professional's perspective, ChatGPT (OpenAI), a large language model (LLM), offers huge potential as a practical and economical digital assistant. However, ChatGPT has not yet been evaluated for the interpretation of polysomnographic results in patients with suspected obstructive sleep apnea (OSA).

AIMS/OBJECTIVES

To evaluate the agreement between ChatGPT-4o and a board-certified sleep physician in the interpretation of polysomnographic results, and to shed light on the role of ChatGPT-4o in medical decision-making in sleep medicine.

MATERIAL AND METHODS

For this proof-of-concept study, 40 comprehensive patient profiles were designed to represent a broad and typical spectrum of cases, with a balanced distribution of demographics and clinical characteristics. After various prompts were tested, one prompt was used for the initial diagnosis of OSA and a second for patients with positive airway pressure (PAP) therapy intolerance. Each polysomnographic result was independently evaluated by ChatGPT-4o and a board-certified sleep physician. Diagnosis and therapy suggestions were analyzed for agreement.

RESULTS

ChatGPT-4o and the sleep physician showed 97% (29/30) concordance in the diagnosis of the simple cases. For the same cases, the two assessors showed 100% (30/30) concordance regarding therapy suggestions. For cases with PAP therapy intolerance, ChatGPT-4o and the sleep physician showed 70% (7/10) concordance in the diagnosis and 44% (22/50) concordance for therapy suggestions.
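The concordance figures above are simple percent agreement: the number of agreeing items divided by the number of items compared. The following is a minimal sketch that recomputes the reported percentages from the reported counts. The function name `percent_agreement` and the dictionary labels are illustrative assumptions, not taken from the study, and rounding to the nearest whole percent is assumed because the abstract reports whole numbers.

```python
# Agreement counts as reported in the RESULTS paragraph: (agreeing, total).
# The labels are illustrative; only the numbers come from the abstract.
REPORTED_COUNTS = {
    "simple cases, diagnosis": (29, 30),
    "simple cases, therapy": (30, 30),
    "PAP intolerance, diagnosis": (7, 10),
    "PAP intolerance, therapy": (22, 50),
}


def percent_agreement(agreeing: int, total: int) -> int:
    """Return simple percent agreement, rounded to the nearest whole percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= agreeing <= total:
        raise ValueError("agreeing must be between 0 and total")
    return round(100 * agreeing / total)


if __name__ == "__main__":
    for label, (agreeing, total) in REPORTED_COUNTS.items():
        pct = percent_agreement(agreeing, total)
        print(f"{label}: {agreeing}/{total} = {pct}%")
```

Percent agreement does not correct for agreement expected by chance, so it should be read as a descriptive figure only. Note also that the therapy denominator for the PAP intolerance cases is 50 rather than 10; the abstract does not state how those 50 items were derived.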

CONCLUSION AND SIGNIFICANCE

Precise prompting improves the output of ChatGPT-4o and yields sleep physician-like interpretation of polysomnographic results. Although ChatGPT shows some shortcomings in offering treatment advice, our results provide evidence for AI-assisted automation and economization of polysomnographic interpretation by LLMs. Further research should explore data protection issues and demonstrate reproducibility with real patient data on a larger scale.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f3ac/11890353/181d73be02d2/405_2024_8985_Fig1_HTML.jpg
