Seifen Christopher, Bahr-Hamm Katharina, Gouveris Haralampos, Pordzik Johannes, Blaikie Andrew, Matthias Christoph, Kuhn Sebastian, Buhr Christoph Raphael
Sleep Medicine Center & Department of Otolaryngology, Head and Neck Surgery, University Medical Center Mainz, Mainz, Germany.
School of Medicine, University of St Andrews, St Andrews, UK.
Nat Sci Sleep. 2025 Apr 29;17:677-688. doi: 10.2147/NSS.S510254. eCollection 2025.
Timely identification of comorbidities is critical in sleep medicine, where large language models (LLMs) like ChatGPT are currently emerging as transformative tools. Here, we investigate whether the novel LLM ChatGPT o1 preview can identify individual health risks or potential comorbidities from the medical data of fictitious sleep medicine patients.
We conducted a simulation-based study using 30 fictitious patients, designed to represent realistic variations in demographic and clinical parameters commonly seen in sleep medicine. Each profile included personal data (eg, body mass index, smoking status, drinking habits), blood pressure, and routine blood test results, along with a predefined sleep medicine diagnosis. Each patient profile was evaluated independently by the LLM and a sleep medicine specialist (SMS) for identification of potential comorbidities or individual health risks. Their recommendations were compared for concordance across lifestyle changes and further medical measures.
The LLM achieved high concordance with the SMS for lifestyle modification recommendations, including 100% concordance on smoking cessation (κ = 1; p < 0.001), 97% on both alcohol reduction (κ = 0.92; p < 0.001) and endocrinological examination (κ = 0.92; p < 0.001), and 93% on weight loss (κ = 0.86; p < 0.001). However, compared to the SMS, it exhibited a tendency to over-recommend further medical measures, most notably for cardiological examination (57% concordance; κ = 0.08; p = 0.28) and gastrointestinal examination (33% concordance; κ = 0.1; p = 0.22).
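The concordance statistics above are Cohen's kappa values, which correct raw percentage agreement for agreement expected by chance. As an illustrative sketch (not the authors' analysis code, and using made-up example data rather than study data), kappa for two raters giving binary recommendations per patient can be computed as follows:

```python
# Cohen's kappa for binary recommendation agreement between two raters,
# e.g. an LLM and a sleep medicine specialist (SMS). The example ratings
# below are hypothetical and do not come from the study.

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length lists of 0/1 labels."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: proportion of patients where both raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal frequencies.
    p_yes_a = sum(rater_a) / n
    p_yes_b = sum(rater_b) / n
    p_e = p_yes_a * p_yes_b + (1 - p_yes_a) * (1 - p_yes_b)
    if p_e == 1:
        return 1.0  # degenerate case: both raters always give the same label
    return (p_o - p_e) / (1 - p_e)

# Hypothetical data: 1 = recommendation given, 0 = not given, per patient.
llm = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
sms = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(llm, sms), 2))  # 9/10 raw agreement, kappa ≈ 0.78
```

Note how 90% raw agreement shrinks to κ ≈ 0.78 once chance agreement is discounted, which is why a 57% raw concordance in the results can correspond to a κ near zero.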
Despite the obvious limitation of using fictitious data, the findings suggest that LLMs like ChatGPT have the potential to complement clinical workflows in sleep medicine by identifying individual health risks and comorbidities. As LLMs continue to evolve, their integration into healthcare could redefine the approach to patient evaluation and risk stratification. Future research should contextualize the findings within broader clinical applications, ideally by testing locally run LLMs that meet data protection requirements.