Wu Huiyi, Soraghan John, Lowit Anja, Di Caterina Gaetano
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.
Acoustic analysis using signal processing tools can extract voice features that distinguish pathological voices from healthy ones. The proposed work uses spectrograms of voice recordings from a voice database as the input to a Convolutional Neural Network (CNN) for automatic feature extraction and classification of disordered and normal voices. The novel classifier achieved 88.5%, 66.2% and 77.0% accuracy on the training, validation and test sets, respectively, using 482 normal and 482 organic dysphonia speech files. The results suggest that the proposed algorithm, evaluated on the Saarbruecken Voice Database, can be used effectively for screening pathological voice recordings.
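The spectrogram input described in the abstract can be illustrated with a minimal sketch. The abstract does not give the spectrogram parameters, so the sample rate, frame length, hop size and Hann window below are assumptions, and a synthetic tone stands in for a voice recording. The function name `log_spectrogram` is hypothetical, and the CNN itself is not shown because its architecture is not described here.

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256, eps=1e-10):
    """Return a log-power spectrogram in dB, shape (n_freq_bins, n_frames).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Pad very short signals so that at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Row i of idx holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    # One-sided FFT of each windowed frame, then power in decibels.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return (10.0 * np.log10(power + eps)).T

# Synthetic 1-second, 1 kHz tone standing in for a voice recording (assumption).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = log_spectrogram(tone)
# Frequency bin with the highest average energy across frames.
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 512
```

The resulting 2-D array is the kind of image-like input a CNN classifier could consume, typically after resizing or normalisation chosen to suit the network.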