Godino-Llorente J I, Gómez-Vilda P
Universidad Politécnica de Madrid, Escuela Universitaria de Ingeniería Técnica de Telecomunicación, Dpt. of Ingeniería de Circuitos y Sistemas, Ctra. Valencia Km. 7, 28031, Madrid.
IEEE Trans Biomed Eng. 2004 Feb;51(2):380-4. doi: 10.1109/TBME.2003.820386.
It is well known that vocal and voice diseases do not necessarily cause perceptible changes in the acoustic voice signal. Acoustic analysis is a useful tool for diagnosing voice diseases and complements methods based on direct observation of the vocal folds by laryngoscopy. This paper studies two neural-network-based classification approaches applied to the automatic detection of voice disorders. The structures studied are the multilayer perceptron and learning vector quantization, both fed with short-term vectors calculated according to the well-known Mel-frequency cepstral coefficient (MFCC) parameterization. The paper shows that these architectures allow the detection of voice disorders, including glottic cancer, under highly reliable conditions. In this context, learning vector quantization proved more reliable than the multilayer perceptron, yielding 96% frame accuracy under similar working conditions.
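The learning vector quantization classifier named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the abstract does not say which LVQ variant was used, so the basic LVQ1 update rule is shown; synthetic Gaussian vectors stand in for real short-term MFCC frames; and the function names, the number of prototypes, the learning rate, and the epoch count are all hypothetical choices.

```python
import numpy as np


def make_synthetic_frames(n_per_class=200, n_features=12, seed=0):
    """Synthetic stand-in for short-term feature frames (NOT real voice data)."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
    pathological = rng.normal(3.0, 1.0, size=(n_per_class, n_features))
    X = np.vstack([normal, pathological])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def train_lvq1(X, y, prototypes_per_class=2, lr=0.05, epochs=20, seed=0):
    """Train LVQ1 prototypes on frames X (n_frames, n_features) with labels y."""
    rng = np.random.default_rng(seed)
    protos, proto_labels = [], []
    # Initialise each prototype from a randomly chosen training frame of its class.
    for c in np.unique(y):
        idx = rng.choice(np.flatnonzero(y == c), size=prototypes_per_class,
                         replace=False)
        protos.append(X[idx])
        proto_labels.extend([c] * prototypes_per_class)
    W = np.vstack(protos).astype(float)
    labels = np.array(proto_labels)

    for epoch in range(epochs):
        alpha = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            # Find the prototype nearest to this frame (squared Euclidean distance).
            k = np.argmin(np.sum((W - X[i]) ** 2, axis=1))
            # LVQ1 rule: attract the winner if its label matches, repel otherwise.
            sign = 1.0 if labels[k] == y[i] else -1.0
            W[k] += sign * alpha * (X[i] - W[k])
    return W, labels


def predict_lvq(W, labels, X):
    """Label each frame with the class of its nearest prototype."""
    d = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return labels[np.argmin(d, axis=1)]


if __name__ == "__main__":
    X, y = make_synthetic_frames()
    W, labels = train_lvq1(X, y)
    frame_accuracy = np.mean(predict_lvq(W, labels, X) == y)
    print(f"frame accuracy on synthetic data: {frame_accuracy:.3f}")
```

Frame accuracy here is simply the fraction of individual frames assigned to the correct class, which is the sense in which the abstract reports its figure. The value printed by this sketch reflects only the easily separable synthetic data and says nothing about the paper's results.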