Multicentre sleep-stage scoring agreement in the Sleep Revolution project.

Affiliations

Department of Technical Physics, University of Eastern Finland, Kuopio, Finland.

Diagnostic Imaging Center, Kuopio University Hospital, Kuopio, Finland.

Publication information

J Sleep Res. 2024 Feb;33(1):e13956. doi: 10.1111/jsr.13956. Epub 2023 Jun 13.

DOI:10.1111/jsr.13956

PMID:37309714

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10909532/

Abstract

Determining sleep stages accurately is an important part of the diagnostic process for numerous sleep disorders. However, as the sleep stage scoring is done manually following visual scoring rules there can be considerable variation in the sleep staging between different scorers. Thus, this study aimed to comprehensively evaluate the inter-rater agreement in sleep staging. A total of 50 polysomnography recordings were manually scored by 10 independent scorers from seven different sleep centres. We used the 10 scorings to calculate a majority score by taking the sleep stage that was the most scored stage for each epoch. The overall agreement for sleep staging was κ = 0.71 and the mean agreement with the majority score was 0.86. The scorers were in perfect agreement in 48% of all scored epochs. The agreement was highest in rapid eye movement sleep (κ = 0.86) and lowest in N1 sleep (κ = 0.41). The agreement with the majority scoring varied between the scorers from 81% to 91%, with large variations between the scorers in sleep stage-specific agreements. Scorers from the same sleep centres had the highest pairwise agreements at κ = 0.79, κ = 0.85, and κ = 0.78, while the lowest pairwise agreement between the scorers was κ = 0.58. We also found a moderate negative correlation between sleep staging agreement and the apnea-hypopnea index, as well as the rate of sleep stage transitions. In conclusion, although the overall agreement was high, several areas of low agreement were also found, mainly between non-rapid eye movement stages.
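The majority score and the agreement figures described in the abstract can be illustrated with a minimal sketch. The abstract does not say which kappa statistic was used or how ties in the majority vote were resolved, so Fleiss' kappa and the fixed tie-break order below are assumptions, and the function names and toy data are hypothetical.

```python
from collections import Counter

# Assumption: five stage labels; ties in the majority vote go to the
# stage listed first here. The abstract does not specify a tie rule.
STAGES = ["W", "N1", "N2", "N3", "R"]


def majority_score(scorings):
    """Return the most frequently scored stage for each epoch.

    scorings: list of per-scorer stage lists, all of equal length.
    """
    majority = []
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels)
        top = max(counts.values())
        majority.append(next(s for s in STAGES if counts.get(s, 0) == top))
    return majority


def agreement_with_majority(scoring, majority):
    """Fraction of epochs where one scorer matches the majority score."""
    return sum(a == b for a, b in zip(scoring, majority)) / len(majority)


def perfect_agreement_fraction(scorings):
    """Fraction of epochs where every scorer chose the same stage."""
    epochs = list(zip(*scorings))
    return sum(len(set(labels)) == 1 for labels in epochs) / len(epochs)


def fleiss_kappa(scorings):
    """Fleiss' kappa for several raters (assumed statistic, see lead-in).

    Assumes at least two raters and more than one stage in the data.
    """
    n_raters = len(scorings)
    n_epochs = len(scorings[0])
    totals = Counter()
    observed_sum = 0.0
    for epoch_labels in zip(*scorings):
        counts = Counter(epoch_labels)
        totals.update(counts)
        # Proportion of agreeing rater pairs within this epoch.
        observed_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_observed = observed_sum / n_epochs
    # Chance agreement from the overall share of each stage.
    p_expected = sum((t / (n_raters * n_epochs)) ** 2 for t in totals.values())
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data: 3 scorers, 4 epochs (not from the study).
toy = [
    ["W", "N1", "N2", "R"],
    ["W", "N2", "N2", "R"],
    ["W", "N1", "N2", "N1"],
]
toy_majority = majority_score(toy)
```

With the toy data, `toy_majority` is `["W", "N1", "N2", "R"]`, and `agreement_with_majority(toy[1], toy_majority)` gives 0.75. Pairwise comparisons such as those reported between scorers from the same centre would use a two-rater statistic instead, which this sketch does not cover.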

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05c7/10909532/f492200ac2a4/JSR-33-e13956-g001.jpg
