Lee Hyojin, Choi You Rim, Lee Hyun Kyung, Jeong Jaemin, Hong Joopyo, Shin Hyun-Woo, Kim Hyung-Sin
Graduate School of Data Science, Seoul National University, Seoul, Republic of Korea.
Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine, Seoul, Republic of Korea.
NPJ Digit Med. 2025 Jan 25;8(1):55. doi: 10.1038/s41746-024-01378-0.
Polysomnography (PSG) is crucial for diagnosing sleep disorders, but manual scoring of PSG is time-consuming and subjective, leading to high inter-scorer variability. While machine-learning models have improved PSG scoring, their clinical use is hindered by their 'black-box' nature. In this study, we present SleepXViT, an automatic sleep staging system using a Vision Transformer (ViT) that provides intuitive, consistent explanations by mimicking human 'visual scoring'. Tested on KISS, a PSG image dataset from 7745 patients across four hospitals, SleepXViT achieved a Macro F1 score of 81.94%, outperforming baseline models and showing robust performance on the public datasets SHHS1 and SHHS2. Furthermore, SleepXViT offers well-calibrated confidence scores, enabling expert review of low-confidence predictions, alongside high-resolution heatmaps that highlight essential features and relevance scores that quantify the influence of adjacent epochs on sleep stage predictions. Together, these explanations reinforce the scoring consistency of SleepXViT, making it both reliable and interpretable, thereby facilitating synergy between the AI model and human scorers in clinical settings.
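The abstract reports a Macro F1 score and describes routing low-confidence predictions to expert review. As a rough illustration only, the sketch below shows how those two quantities are commonly computed. It is not the authors' implementation: the five-stage label set, the function names `macro_f1` and `flag_for_review`, and the 0.7 threshold are all assumptions introduced for this example.

```python
# Illustrative sketch, not the SleepXViT code.
# Assumption: five sleep-stage labels (the abstract does not list them).
STAGES = ["W", "N1", "N2", "N3", "R"]


def macro_f1(y_true, y_pred, labels=STAGES):
    """Unweighted mean of per-class F1 scores over `labels`."""
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted epochs contributes F1 = 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


def flag_for_review(confidences, threshold=0.7):
    """Indices of epochs whose confidence is strictly below `threshold`.

    The threshold value is hypothetical; a real cutoff would be chosen
    from calibration results on held-out data.
    """
    return [i for i, c in enumerate(confidences) if c < threshold]
```

Because Macro F1 averages per-class scores without weighting by class frequency, a rare stage counts as much as a common one, which is why it is often preferred over accuracy for imbalanced sleep-stage data.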