Université Paris-Saclay, Inria, CEA, Palaiseau 91120, France.
J Neural Eng. 2022 Nov 28;19(6). doi: 10.1088/1741-2552/aca220.
The use of deep learning for electroencephalography (EEG) classification tasks has grown rapidly in recent years, yet its application has been limited by the relatively small size of EEG datasets. Data augmentation, which consists of artificially increasing the size of the dataset during training, can be employed to alleviate this problem. While a few augmentation transformations for EEG data have been proposed in the literature, their positive impact on performance is often evaluated on a single dataset and compared to only one or two competing augmentation methods. This work proposes to better validate the existing data augmentation approaches through a unified and exhaustive analysis. We quantitatively compare 13 different augmentations on two different predictive tasks, datasets and models, using three different types of experiments. We demonstrate that employing adequate data augmentations can improve accuracy by up to 45% in low data regimes compared to the same model trained without any augmentation. Our experiments also show that there is no single best augmentation strategy, as the augmentations that work well differ from task to task. Our results highlight the best data augmentations to consider for sleep stage classification and motor imagery brain-computer interfaces. More broadly, they demonstrate that EEG classification tasks benefit from adequate data augmentation.
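To make the idea of augmenting EEG data during training concrete, here is a minimal NumPy sketch. It is only an illustration under stated assumptions: the abstract does not name the 13 augmentations it compares, so the three transforms below (additive Gaussian noise, sign flip, time reversal) are generic examples chosen for this sketch, and the function names, the `(n_epochs, n_channels, n_times)` array layout, and the parameters `p` and `noise_std` are all hypothetical.

```python
import numpy as np


def add_gaussian_noise(X, std, rng):
    """Return X plus zero-mean Gaussian noise with standard deviation `std`."""
    return X + rng.normal(0.0, std, size=X.shape)


def sign_flip(X):
    """Invert the polarity of every channel."""
    return -X


def time_reverse(X):
    """Reverse each epoch along the time axis (assumed to be the last axis)."""
    return X[..., ::-1]


def augment_batch(X, rng, p=0.5, noise_std=0.1):
    """Randomly transform a batch of EEG epochs.

    X is assumed to have shape (n_epochs, n_channels, n_times).
    Each transform is applied independently to each epoch with
    probability p. Labels are left unchanged, which assumes the
    transforms preserve the class of the epoch.
    """
    X_aug = X.copy()
    n_epochs = X.shape[0]

    mask = rng.random(n_epochs) < p
    X_aug[mask] = add_gaussian_noise(X_aug[mask], noise_std, rng)

    mask = rng.random(n_epochs) < p
    X_aug[mask] = sign_flip(X_aug[mask])

    mask = rng.random(n_epochs) < p
    X_aug[mask] = time_reverse(X_aug[mask])

    return X_aug


# Example: 8 synthetic epochs, 2 channels, 100 time samples each.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 2, 100))
X_aug = augment_batch(X, rng)
```

In a training loop, `augment_batch` would be called on each mini-batch before the forward pass, so the model sees a different random variant of every epoch at each pass over the data. Whether a given transform actually preserves the label depends on the task, which is consistent with the abstract's finding that the augmentations that work well differ between tasks.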