School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
Neurology Department, Massachusetts General Hospital, Wang 720, Boston, MA, USA.
J Am Med Inform Assoc. 2018 Dec 1;25(12):1643-1650. doi: 10.1093/jamia/ocy131.
Scoring laboratory polysomnography (PSG) data remains a manual task of visually annotating 3 primary categories: sleep stages, sleep-disordered breathing, and limb movements. Attempts to automate this process have been hampered by the complexity of PSG signals and by physiological heterogeneity between patients. Deep neural networks, which have recently achieved expert-level performance on other complex medical tasks, are ideally suited to PSG scoring, given sufficient training data.
We used a combination of deep recurrent and convolutional neural networks (RCNN) for supervised learning of clinical labels designating sleep stages, sleep apnea events, and limb movements. The data for testing and training were derived from 10 000 clinical PSGs and 5804 research PSGs.
When trained on the clinical dataset, the RCNN reproduced PSG diagnostic scoring for sleep staging, sleep apnea, and limb movements with accuracies of 87.6%, 88.2%, and 84.7%, respectively, on held-out test data, a level of performance comparable to that of human experts. The RCNN model performed equally well when tested on the independent research PSG database. Only small reductions in accuracy were noted when training on limited channels to mimic at-home monitoring devices: frontal leads only for sleep staging, and thoracic belt signals only for the apnea-hypopnea index.
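For readers unfamiliar with the two quantities reported above, the following is a minimal sketch of how per-epoch accuracy and the apnea-hypopnea index (events per hour of sleep) are conventionally computed. It is not the authors' evaluation code. The function names, the stage label "W" for wake, and the 30-second epoch length are assumptions made for illustration.

```python
def epoch_accuracy(predicted, reference):
    """Fraction of epochs where the predicted label matches the reference label."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def apnea_hypopnea_index(n_events, stage_labels, epoch_seconds=30.0):
    """Apneas plus hypopneas per hour of sleep.

    stage_labels holds one label per epoch; "W" (assumed here) marks wake,
    and every other label counts as sleep.
    """
    sleep_epochs = sum(1 for s in stage_labels if s != "W")
    sleep_hours = sleep_epochs * epoch_seconds / 3600.0
    if sleep_hours == 0:
        raise ValueError("no sleep epochs; the index is undefined")
    return n_events / sleep_hours


# Hypothetical night: 2 hours awake, 6 hours asleep, 90 scored respiratory events.
stages = ["W"] * 240 + ["N2"] * 720
ahi = apnea_hypopnea_index(90, stages)  # 90 events / 6 hours of sleep
```

Because the index is normalized by sleep time rather than recording time, errors in sleep staging carry over into the apnea-hypopnea index, which is one reason the two scoring tasks are usually evaluated together.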
By creating accurate deep learning models for sleep scoring, our work opens the path toward broader and more timely access to sleep diagnostics. Accurate scoring automation can improve the utility and efficiency of in-lab and at-home approaches to sleep diagnostics, potentially extending the reach of sleep expertise beyond specialty clinics.