Department of Technical Physics, University of Eastern Finland, Kuopio, Finland.
Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland.
J Sleep Res. 2024 Feb;33(1):e13956. doi: 10.1111/jsr.13956. Epub 2023 Jun 13.
Determining sleep stages accurately is an important part of the diagnostic process for numerous sleep disorders. However, because sleep stages are scored manually following visual scoring rules, the staging can vary considerably between scorers. Thus, this study aimed to comprehensively evaluate the inter-rater agreement in sleep staging. A total of 50 polysomnography recordings were manually scored by 10 independent scorers from seven different sleep centres. We used the 10 scorings to calculate a majority score by taking, for each epoch, the sleep stage assigned by the most scorers. The overall agreement for sleep staging was κ = 0.71 and the mean agreement with the majority score was 0.86. The scorers were in perfect agreement in 48% of all scored epochs. The agreement was highest in rapid eye movement sleep (κ = 0.86) and lowest in N1 sleep (κ = 0.41). The agreement with the majority scoring varied between the scorers from 81% to 91%, with large variations between the scorers in sleep stage-specific agreements. Scorers from the same sleep centres had the highest pairwise agreements at κ = 0.79, κ = 0.85, and κ = 0.78, while the lowest pairwise agreement between the scorers was κ = 0.58. We also found moderate negative correlations between sleep staging agreement and both the apnea-hypopnea index and the rate of sleep stage transitions. In conclusion, although the overall agreement was high, several areas of low agreement were also found, mainly between non-rapid eye movement stages.
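The agreement measures described in the abstract can be illustrated with a minimal sketch on toy data. This is not the authors' code. The stage labels, the function names, the tie-breaking rule (ties go to the stage listed first in `STAGES`), and the use of Cohen's κ for pairwise agreement are all assumptions made for illustration; the abstract does not state which κ variant or tie rule was used.

```python
from collections import Counter

# Assumed stage labels and order; the order is only used to break ties.
STAGES = ["W", "N1", "N2", "N3", "R"]


def majority_score(scorings):
    """Return, for each epoch, the stage assigned by the most scorers.

    `scorings` holds one list of stage labels per scorer, all of equal length.
    Ties go to the stage listed first in STAGES (an assumption).
    """
    majority = []
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels)
        majority.append(max(STAGES, key=lambda s: counts[s]))
    return majority


def agreement_with_majority(scorings, majority):
    """Fraction of epochs in which each scorer matches the majority score."""
    n = len(majority)
    return [sum(a == m for a, m in zip(s, majority)) / n for s in scorings]


def unanimous_fraction(scorings):
    """Fraction of epochs in which all scorers assigned the same stage."""
    epochs = list(zip(*scorings))
    return sum(len(set(e)) == 1 for e in epochs) / len(epochs)


def cohen_kappa(a, b):
    """Cohen's kappa between two scorers' epoch-by-epoch stagings."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two scorers' marginal stage frequencies.
    expected = sum(count_a[s] * count_b[s] for s in STAGES) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used a single identical stage throughout
    return (observed - expected) / (1.0 - expected)


# Toy example: 3 hypothetical scorers, 6 epochs.
scorings = [
    ["W", "N1", "N2", "N2", "N3", "R"],
    ["W", "N2", "N2", "N2", "N3", "R"],
    ["W", "N1", "N2", "N3", "N3", "R"],
]
majority = majority_score(scorings)
per_scorer = agreement_with_majority(scorings, majority)
unanimous = unanimous_fraction(scorings)
kappa_01 = cohen_kappa(scorings[0], scorings[1])
```

In this toy data the scorers disagree only on the N1/N2 and N2/N3 boundaries, so the unanimous fraction falls below every individual scorer's agreement with the majority, which is the same pattern the abstract reports (48% versus 81% to 91%).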