Philips Sleep and Respiratory Care, Vienna, Austria.
The Siesta Group Schlafanalyse GmbH, Vienna, Austria.
Adv Exp Med Biol. 2022;1384:107-130. doi: 10.1007/978-3-031-06413-5_7.
Conventionally, sleep and associated events are scored visually by trained technologists according to the rules summarized in the American Academy of Sleep Medicine Manual. Since its first publication in 2007, the manual has been continuously updated; the most recent version as of this writing was published in 2020. Human expert scoring is considered the gold standard, even though there is increasing evidence of limited interrater reliability among human scorers. Significant advances in machine learning have produced powerful methods for addressing complex classification problems such as automated scoring of sleep and associated events. Evidence is increasing that these autoscoring systems deliver performance comparable to manual scoring and offer several advantages over visual scoring: (1) avoidance of the expensive, time-consuming, and difficult visual scoring task, which can be performed only by well-trained and experienced human scorers; (2) attainment of consistent scoring results; and (3) provision of added value such as scoring in real time, sleep stage probabilities per epoch (hypnodensity), estimates of signal quality and sleep/wake-related features, identification of periods with clinically relevant ambiguities (confidence trends), configurable sensitivity and rule settings, and cardiorespiratory sleep staging for home sleep apnea testing. This chapter describes the development of autoscoring systems from the first attempts in the 1970s up to the most recent solutions based on deep neural network approaches, which achieve an accuracy that allows the autoscoring results to be used directly for review and interpretation by a physician.