Chen Zihan, Qian Yaojia, Wang Yuxi, Fang Yinfeng
College of Telecommunication, Hangzhou Dianzi University, Hangzhou, China.
Front Bioeng Biotechnol. 2022 Jul 29;10:909653. doi: 10.3389/fbioe.2022.909653. eCollection 2022.
Acquiring bio-signals from the human body requires a strict experimental setup and ethical approval, which limits the data available for training classifiers in the era of big data. Generating synthetic data from real data could change this situation. This article proposes a multi-channel electromyography (EMG) data augmentation method based on a deep convolutional generative adversarial network (DCGAN). The generation procedure is as follows. First, the multi-channel EMG signals within each sliding window are converted to grayscale images through matrix transformation, normalization, and histogram equalization. Second, the grayscale images of each class are used to train a DCGAN, so that synthetic grayscale images of that class can be generated from random noise input. Classification accuracy is used as the index for evaluating whether the synthetic data are similar to the real data while also adding diversity. A public EMG dataset for hand motion recognition (ISR Myo-I) is used to demonstrate the usability of the proposed method. The experimental results show that adding synthetic data to the training data has little effect on classification performance, which indicates that the synthetic data are similar to the real data. Moreover, the average five-class accuracy increases slightly, by 1%-2%, for the support vector machine (SVM) and random forest (RF) classifiers when additional synthetic data are used for training. Although the improvement is not statistically significant, it suggests that the data generated by the DCGAN carry new characteristics of their own and may enrich the diversity of the training dataset. In addition, cross-validation analysis shows that the synthetic samples have a large inter-class distance, reflected in the higher cross-validation accuracy obtained when classifying purely synthetic samples. Furthermore, this article demonstrates that histogram equalization can significantly improve the performance of EMG-based hand motion recognition.
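The first step of the generation procedure (sliding window, normalization, histogram equalization) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the matrix transformation, window length, step, or gray-level depth, so the channels-as-rows layout, per-window min-max scaling, 256 gray levels, and the function names `sliding_windows`, `window_to_gray_image`, and `equalize_histogram` are all assumptions made for illustration.

```python
def sliding_windows(signal, width, step):
    """Yield channel-by-width windows from a multi-channel signal.

    `signal` is a list of channels, each a list of samples (assumed layout).
    """
    n_samples = len(signal[0])
    for start in range(0, n_samples - width + 1, step):
        yield [channel[start:start + width] for channel in signal]


def window_to_gray_image(window, levels=256):
    """Min-max normalize one window to integer gray levels (assumed scaling)."""
    flat = [v for row in window for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant window
    return [[int(round((levels - 1) * (v - lo) / span)) for v in row]
            for row in window]


def equalize_histogram(gray, levels=256):
    """Standard histogram equalization via the cumulative distribution."""
    flat = [v for row in gray for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to equalize
        return [row[:] for row in gray]
    lut = [round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1))
           for c in cdf]
    return [[lut[v] for v in row] for row in gray]


# Toy two-channel signal; real EMG recordings have more channels and samples.
signal = [[0.0, 0.1, 0.2, 0.3, 0.2, 0.1],
          [0.1, 0.1, 0.9, 1.0, 0.4, 0.0]]
images = [equalize_histogram(window_to_gray_image(w))
          for w in sliding_windows(signal, width=4, step=2)]
```

Each element of `images` has one row per channel and one column per sample in the window. In the procedure described above, images like these, grouped by class, would form the training set for the DCGAN.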