Department of Electrical and Computer Engineering, Aarhus University, Aarhus N, Denmark.
PLoS One. 2024 Aug 6;19(8):e0307202. doi: 10.1371/journal.pone.0307202. eCollection 2024.
Over the past few years, deep neural networks have shown impressive performance in automatic sleep staging. Recent studies have demonstrated that multiple data sets must be combined to obtain models that generalize sufficiently well. However, working with large amounts of sleep data can be challenging, both from a hardware perspective and because distinct data sources require different preprocessing steps. Here we review the possible obstacles and present an open-source pipeline for automatic data loading. Our solution includes both a standardized data store and a 'data serving' portion that can be used to train neural networks on the standardized data, with configuration options to suit different studies and machine learning designs. The pipeline, including its implementation, is made public to support better and more reproducible sleep research.
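To make the split between a standardized data store and a 'data serving' layer concrete, the following is a minimal sketch and not the published pipeline's API. Every name in it (`Record`, `standardize`, `serve_batches`, the nested-dictionary store layout) is hypothetical, and it assumes single-channel recordings scored in 30-second epochs. Resampling is done by naive decimation purely to keep the example short; a real pipeline would apply an anti-aliasing filter first.

```python
import random
from dataclasses import dataclass
from typing import Dict, Iterator, List, Sequence, Tuple

EPOCH_SECONDS = 30  # assumed scoring epoch length


@dataclass
class Record:
    """One recording in the (hypothetical) standardized layout."""
    sample_rate: int     # Hz, identical for all records after standardization
    signal: List[float]  # single channel, already resampled
    labels: List[int]    # one sleep stage per epoch


# Hypothetical store layout: dataset name -> record id -> Record
Store = Dict[str, Dict[str, Record]]


def standardize(raw_signal: Sequence[float], raw_rate: int,
                labels: Sequence[int], target_rate: int) -> Record:
    """Bring one source recording to the common sample rate.

    Simplification: naive decimation without filtering, so raw_rate
    must be an integer multiple of target_rate.
    """
    if raw_rate % target_rate != 0:
        raise ValueError("raw_rate must be a multiple of target_rate")
    step = raw_rate // target_rate
    signal = list(raw_signal[::step])
    per_epoch = target_rate * EPOCH_SECONDS
    # Keep only epochs that have both a label and a full window of samples.
    n_epochs = min(len(labels), len(signal) // per_epoch)
    return Record(target_rate, signal[:n_epochs * per_epoch],
                  list(labels[:n_epochs]))


def serve_batches(store: Store, datasets: Sequence[str], batch_size: int,
                  seed: int = 0) -> Iterator[List[Tuple[List[float], int]]]:
    """Yield shuffled batches of (epoch_signal, label) from chosen datasets."""
    rng = random.Random(seed)
    # Index every epoch once so that data sets of different sizes can be mixed.
    index = [(d, r, e)
             for d in datasets
             for r, rec in store[d].items()
             for e in range(len(rec.labels))]
    rng.shuffle(index)
    for start in range(0, len(index), batch_size):
        batch = []
        for d, r, e in index[start:start + batch_size]:
            rec = store[d][r]
            n = rec.sample_rate * EPOCH_SECONDS
            batch.append((rec.signal[e * n:(e + 1) * n], rec.labels[e]))
        yield batch
```

In this sketch the per-source preprocessing happens once, in `standardize`, while the choice of data sets, batch size, and shuffling seed is deferred to `serve_batches`, so the same store could back several different training configurations.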