Liang Zilu
Ubiquitous and Personal Computing Lab, Kyoto University of Advanced Science (KUAS), 18 Yamanouchi Gotanda-cho, Ukyo-ku, Kyoto, Japan.
Sleep Breath. 2024 Dec;28(6):2409-2420. doi: 10.1007/s11325-024-03141-x. Epub 2024 Aug 27.
This study aims to develop sleep apnea screening models with overnight SpO2 data, and to investigate the impact of the SpO2 data granularity on model performance.
A total of 7,718 SpO2 recordings from the SHHS and MESA datasets were used. Probabilistic ensemble machine learning was employed to predict sleep apnea status at three AHI cutoff points: ≥ 5, ≥ 15, and ≥ 30 events/hour. To investigate the impact of data granularity, SpO2 data were aggregated at 30, 60, and 300 s.
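The aggregation and labelling steps described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names `aggregate_spo2` and `ahi_labels` are hypothetical, the trace is assumed to be sampled at 1 Hz, and the mean over non-overlapping windows is an assumed aggregate, since the abstract does not say which statistic was used.

```python
import numpy as np

def aggregate_spo2(spo2_1hz, window_s):
    """Average a 1 Hz SpO2 trace over non-overlapping windows of window_s seconds.

    Assumptions (not stated in the source): the mean is the aggregate,
    and a trailing partial window is dropped.
    """
    x = np.asarray(spo2_1hz, dtype=float)
    n_windows = x.size // window_s
    return x[: n_windows * window_s].reshape(n_windows, window_s).mean(axis=1)

def ahi_labels(ahi, cutoffs=(5, 15, 30)):
    """Return one binary label per AHI cutoff (1 if AHI >= cutoff, else 0)."""
    return {c: int(ahi >= c) for c in cutoffs}

if __name__ == "__main__":
    # Synthetic 10-minute trace for illustration only; not SHHS or MESA data.
    rng = np.random.default_rng(0)
    trace = 96 + rng.normal(0, 1, size=600)
    for w in (30, 60, 300):
        print(w, aggregate_spo2(trace, w).shape)
    print(ahi_labels(17.2))
```

Each cutoff yields its own binary target, so a recording with an AHI of 17.2 events/hour is positive at ≥ 5 and ≥ 15 but negative at ≥ 30.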
Our models demonstrated good to excellent performance on the internal test set, with average area under the curve (AUC) values of 0.91, 0.93, and 0.96 for cutoffs ≥ 5, ≥ 15, and ≥ 30, respectively, at a data granularity of 1 s. Both sensitivity (0.76, 0.84, 0.89) and specificity (0.87, 0.86, 0.90) ranged from good to excellent across the three cutoffs. Positive predictive values (PPV) ranged from excellent to fair (0.97, 0.83, 0.66), and negative predictive values (NPV) ranged from low to excellent (0.43, 0.87, 0.98). Model performance dropped slightly on the external test set compared with the internal test set, but AUC remained good to excellent (above 0.80) across all data granularities and all three cutoffs. A data granularity of 300 s led to a reduction in performance metrics across all cutoffs.
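The four threshold metrics reported above follow from a confusion matrix, and PPV and NPV additionally depend on how common the condition is in the tested population. The sketch below shows both relationships. The function names are hypothetical, and any prevalence passed to `predictive_values` is an illustrative assumption, since the abstract does not report prevalence.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV via Bayes' rule for a given (assumed) prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

if __name__ == "__main__":
    # Made-up counts for illustration; not results from the study.
    print(screening_metrics(tp=80, fp=10, fn=20, tn=90))
    # Sensitivity and specificity reported for the >= 5 cutoff,
    # evaluated at hypothetical prevalence values.
    for prevalence in (0.2, 0.5, 0.8):
        print(prevalence, predictive_values(0.76, 0.87, prevalence))
```

With sensitivity and specificity held fixed, PPV rises and NPV falls as prevalence increases. This is one plausible reading of why PPV is highest and NPV lowest at the ≥ 5 cutoff, where the largest share of recordings is positive, although the abstract itself does not give this explanation.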
Our models outperformed existing large sleep apnea screening models across all three AHI cutoff thresholds, even when varying SpO2 data granularity is taken into account. However, lower data granularity (that is, aggregation over longer windows) was associated with decreased screening performance, indicating a need for further research in this area.