Pini Nicolò, Ong Ju Lynn, Yilmaz Gizem, Chee Nicholas I Y N, Siting Zhao, Awasthi Animesh, Biju Siddharth, Kishan Kishan, Patanaik Amiya, Fifer William P, Lucchini Maristella
Department of Psychiatry, Columbia University Irving Medical Center, New York, NY, United States.
Division of Developmental Neuroscience, New York State Psychiatric Institute, New York, NY, United States.
Front Neurosci. 2022 Oct 6;16:974192. doi: 10.3389/fnins.2022.974192. eCollection 2022.
The rapid advancement of wearable solutions for monitoring and scoring sleep stages has enabled monitoring outside conventional clinical settings. However, most devices and algorithms lack extensive and independent validation, a fundamental step to ensure robustness, stability, and replicability of the results beyond the training and testing phases. As a consequence, these systems are not yet regarded as feasible and reliable alternatives to the gold standard, polysomnography (PSG).
This validation study highlights the accuracy and precision of the proposed heart rate (HR)-based deep-learning algorithm for sleep staging. The illustrated solution can classify 30-s epochs at 2 levels (Wake; Sleep), 3 levels (Wake; NREM; REM), or 4 levels (Wake; Light; Deep; REM). The algorithm was validated using an open-source dataset of PSG recordings (Physionet CinC dataset, n = 994 participants, 994 recordings) and a proprietary dataset of ECG recordings (Z3Pulse, n = 52 participants, 112 recordings) collected with a chest-worn wireless sensor alongside simultaneous PSG recorded with SOMNOtouch.
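The relationship between the three label schemes can be illustrated with a small Python sketch. It assumes that Light and Deep together make up NREM sleep, and it only shows how reference labels at the three granularities relate to one another; it is not a description of how the authors trained or derived their models. The names `TO_3_LEVELS`, `TO_2_LEVELS`, and `collapse` are hypothetical.

```python
# Hypothetical mappings from the 4-level scheme to the coarser schemes.
# Assumption: Light and Deep sleep are both NREM; everything but Wake is Sleep.
TO_3_LEVELS = {"Wake": "Wake", "Light": "NREM", "Deep": "NREM", "REM": "REM"}
TO_2_LEVELS = {"Wake": "Wake", "Light": "Sleep", "Deep": "Sleep", "REM": "Sleep"}


def collapse(stages, mapping):
    """Relabel a sequence of 30-s epoch labels using the given mapping."""
    return [mapping[stage] for stage in stages]


# Six consecutive 30-s epochs (3 minutes) of illustrative labels.
hypnogram_4 = ["Wake", "Light", "Light", "Deep", "REM", "Wake"]
hypnogram_3 = collapse(hypnogram_4, TO_3_LEVELS)
hypnogram_2 = collapse(hypnogram_4, TO_2_LEVELS)
```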
We evaluated the performance of the models in both datasets in terms of Accuracy (A), Cohen's kappa (K), Sensitivity (SE), Specificity (SP), Positive Predictive Value (PPV), and Negative Predictive Value (NPV). In the CinC dataset, the highest accuracy was achieved by the 2-level model (0.8797), while the 3-level model obtained the best K (0.6025). The 4-level model obtained the lowest SE (0.3812) and the highest SP (0.9744) for the classification of Deep sleep segments. Apnea-hypopnea index (AHI) and biological sex did not affect scoring, while a significant decrease in performance with age was reported across the models. In the Z3Pulse dataset, the highest accuracy was achieved by the 2-level model (0.8812), whereas the 3-level model obtained the best K (0.611). For the classification of sleep states, the lowest SE (0.6163) and the highest SP (0.9606) were obtained for Deep sleep segments.
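For readers unfamiliar with these metrics, the following Python sketch shows one common way to compute them from epoch-by-epoch labels, with the per-stage metrics computed one-vs-rest. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, stages are encoded as integers (here 0 = Wake, 1 = Light, 2 = Deep, 3 = REM), and the example labels are invented for the demonstration.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are reference (PSG) stages, columns are predicted stages."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    """Fraction of epochs on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    p_o = sum(cm[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in cm]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


def one_vs_rest(cm, stage):
    """SE, SP, PPV, NPV for one stage against all other stages.

    No guard against empty denominators: a stage that never occurs or is
    never predicted would raise ZeroDivisionError in this sketch.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[stage][stage]
    fn = sum(cm[stage]) - tp
    fp = sum(row[stage] for row in cm) - tp
    tn = total - tp - fn - fp
    return {
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Invented labels for eight epochs, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 1]
cm = confusion_matrix(y_true, y_pred, n_classes=4)
deep = one_vs_rest(cm, stage=2)
```

A pattern of low SE with high SP for a stage, as reported for Deep sleep, means the classifier rarely labels other stages as that stage but misses many of its true epochs.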
The results of the validation procedure demonstrated the feasibility of accurate HR-based sleep staging. The combination of the proposed sleep staging algorithm with an inexpensive HR device provides a cost-effective and non-invasive solution deployable in the home environment and robust across age, sex, and AHI scores.