Wand Michael, Schultz Tanja
Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:4200-3. doi: 10.1109/EMBC.2014.6944550.
We report on the classification of phones and phonetic features from facial electromyographic (EMG) data, in the context of our EMG-based Silent Speech interface. We show that a Deep Neural Network can perform this classification task and yields a significant improvement over conventional Gaussian mixture models. Our central contribution is the visualization of the patterns learned by the neural network: with increasing network depth, these patterns represent increasingly intricate electromyographic activity.