Haimov Sharon, Tabakhov Alissa, Tauman Riva, Behar Joachim A
IEEE Trans Biomed Eng. 2025 Feb;72(2):760-767. doi: 10.1109/TBME.2024.3470534. Epub 2025 Jan 21.
Sleep staging is critical for diagnosing sleep disorders. Traditional methods in clinical settings involve time-intensive scoring procedures. Recent advancements in data-driven algorithms using photoplethysmogram (PPG) time series have shown promise in automating sleep staging in adults. However, for children, algorithm development is hindered by the limited availability of datasets, with the Childhood Adenotonsillectomy Trial (CHAT) being the only substantial source, comprising recordings from children aged 5-10. This limitation constrains the evaluation of algorithmic generalization performance.
We employed a deep learning model for sleep staging from PPG that was initially trained on a large dataset of adult sleep recordings, and fine-tuned it on 80% of the CHAT dataset (CHAT-train) for three-class sleep staging (wake, REM, non-REM). Its performance was compared with that of the same model architecture trained from scratch on CHAT-train (the benchmark). Both algorithms were evaluated on the local test set, denoted CHAT-test, and on a newly introduced independent dataset.
Our fine-tuned deep learning algorithm achieved a Cohen's kappa of 0.88 on CHAT-test (versus 0.65 for the benchmark) and generalized to the external Ichilov dataset, with a kappa of 0.72 for children above 5 years old (versus 0.64) and 0.64 for those below 5 (versus 0.53).
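For readers unfamiliar with the metric, the following is a minimal sketch of how Cohen's kappa can be computed from per-epoch stage labels in the three-class setting. It is an illustration only, not the authors' evaluation code: the function name `cohens_kappa` and the toy label sequences are assumptions, and the values are not taken from the study.

```python
from collections import Counter


def cohens_kappa(reference, predicted):
    """Cohen's kappa between two equal-length label sequences.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from the label marginals.
    """
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(reference)

    # Observed agreement: fraction of epochs where both labels match.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Chance agreement: sum over classes of the product of the marginals.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    classes = set(ref_counts) | set(pred_counts)
    expected = sum(ref_counts[c] * pred_counts[c] for c in classes) / (n * n)

    if expected == 1.0:
        # Degenerate case: both sequences contain one identical class.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 10 epochs, 8 of which agree.
reference = ["wake", "wake", "REM", "non-REM", "non-REM",
             "non-REM", "REM", "wake", "non-REM", "non-REM"]
predicted = ["wake", "non-REM", "REM", "non-REM", "non-REM",
             "non-REM", "REM", "wake", "non-REM", "REM"]

kappa = cohens_kappa(reference, predicted)
print(f"kappa = {kappa:.2f}")  # observed 0.80, chance 0.37, kappa about 0.68
```

In this toy case, raw agreement is 0.80 while kappa is lower, because part of that agreement would be expected by chance given how often each stage occurs. The correction matters for sleep data, where non-REM typically dominates the night.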
This research establishes a new state of the art for sleep staging in children from raw PPG. The findings underscore the value of transfer learning from the adult domain to the pediatric domain. However, the reduced performance in children under 5 suggests that further research and additional datasets covering a broader pediatric age range are needed to fully address generalization limitations.