Danker-Hopfe Heidi, Anderer Peter, Zeitlhofer Josef, Boeck Marion, Dorn Hans, Gruber Georg, Heller Esther, Loretz Erna, Moser Doris, Parapatics Silvia, Saletu Bernd, Schmidt Andrea, Dorffner Georg
Department of Psychiatry and Psychotherapy, Charité-Universitätsmedizin Berlin, Berlin, Germany.
J Sleep Res. 2009 Mar;18(1):74-84. doi: 10.1111/j.1365-2869.2008.00700.x.
Interrater variability of sleep stage scorings has an essential impact not only on the reading of polysomnographic sleep studies (PSGs) for clinical trials but also on the evaluation of patients' sleep. With the introduction of a new standard for sleep stage scoring (the AASM standard), there is a need for studies on interrater reliability (IRR). The SIESTA database, resulting from an EU-funded project, provides a large number of studies (n = 72; 56 healthy controls and 16 subjects with different sleep disorders; mean age ± SD: 57.7 ± 18.7; 34 females) that were scored according to both standards (AASM and R&K). Differences in IRR were analysed at two levels: (1) based on quantitative sleep parameters by means of intraclass correlations; and (2) based on an epoch-by-epoch comparison by means of Cohen's kappa and Fleiss' kappa. Overall agreement was 82.0% for the AASM standard (Cohen's kappa = 0.76) and 80.6% for the R&K standard (Cohen's kappa = 0.68). Agreement increased from R&K to AASM for all sleep stages except N2. The results of this study underline that the modification of the scoring rules improves IRR on the one hand, as a result of the integration of occipital, central and frontal leads, but reduces IRR on the other hand specifically for N2, due to the new rule that cortical arousals with or without a concurrent increase in submental electromyogram are critical events for the end of N2.
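As an illustration of the epoch-by-epoch comparison, the following is a minimal sketch of how percent agreement and Cohen's kappa can be computed for two scorers. It is not the authors' analysis code. The function names `percent_agreement` and `cohens_kappa` and the eight-epoch toy hypnograms are hypothetical and are not taken from the SIESTA database.

```python
from collections import Counter


def percent_agreement(scorer_a, scorer_b):
    """Share of epochs to which both scorers assigned the same stage."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("scorings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return matches / len(scorer_a)


def cohens_kappa(scorer_a, scorer_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(scorer_a)
    observed = percent_agreement(scorer_a, scorer_b)
    counts_a = Counter(scorer_a)
    counts_b = Counter(scorer_b)
    # Chance agreement: sum over stages of the product of each scorer's
    # marginal proportion for that stage.
    expected = sum(counts_a[s] * counts_b[s] for s in counts_a) / (n * n)
    if expected == 1.0:
        # Both scorers used a single identical stage throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: one stage label per 30-second epoch (AASM labels).
scorer_a = ["W", "N1", "N2", "N2", "N3", "R", "N2", "W"]
scorer_b = ["W", "N2", "N2", "N2", "N3", "R", "N1", "W"]

agreement = percent_agreement(scorer_a, scorer_b)
kappa = cohens_kappa(scorer_a, scorer_b)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches expected by chance given how often each scorer uses each stage. Fleiss' kappa generalises the same idea to more than two scorers.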