Biomedical Engineering Department, Jordan University of Science & Technology, PO Box 3030, Irbid 22110, Jordan.
Med Biol Eng Comput. 2011 Jul;49(7):811-8. doi: 10.1007/s11517-011-0751-1. Epub 2011 Mar 16.
This work attempts to recognize Arabic vowels from facial electromyography (EMG) signals, with applications for people with speech impairments and for human-computer interfaces. Vowels were selected because they are the most difficult letters for people to recognize in the Arabic language. Twenty subjects (7 females and 13 males) were asked to pronounce three Arabic vowels continuously in random order. Facial EMG signals were recorded over three channels from the three main facial muscles responsible for speech. The EMG signals were then pre-processed to eliminate noise and interference. A segmentation procedure based on a moving standard deviation window was implemented to extract the time event corresponding to each vowel; its accuracy was found to be 94%. The vowels were recognized by extracting features from the EMG in three domains: temporal, spectral, and time-frequency (using the wavelet packet transform). The extracted features were then classified using different classification methods implemented in the WEKA software. The random forest classifier with time-frequency features showed the best performance, with an accuracy of 77% evaluated using 10-fold cross-validation.
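The segmentation step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the window length, the threshold, or the sampling rate, so the values below are assumptions chosen only for a synthetic single-channel signal, and the function names `moving_std` and `segment_events` are hypothetical. This is not the authors' implementation; it only shows the general idea of marking an event wherever the moving standard deviation rises above a fixed threshold.

```python
import numpy as np


def moving_std(signal, window):
    """Standard deviation over every sliding window of `window` samples."""
    x = np.asarray(signal, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    return windows.std(axis=1)


def segment_events(signal, window, threshold):
    """Return (start, end) index pairs where the moving std exceeds `threshold`.

    Indices refer to the first sample of each window; `end` is exclusive.
    """
    active = moving_std(signal, window) > threshold
    # Pad with False so events touching either edge still produce both edges.
    padded = np.concatenate(([False], active, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return list(zip(starts.tolist(), ends.tolist()))


# Synthetic stand-in for one EMG channel: silence with a single burst.
signal = np.zeros(1000)
signal[400:600] = np.sin(2 * np.pi * np.arange(200) / 10)

# Assumed parameters (not taken from the paper).
events = segment_events(signal, window=50, threshold=0.3)
print(events)
```

On real recordings the threshold would typically be derived from the resting-baseline noise level, and short gaps between detected events might need merging; both choices are outside what the abstract specifies.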