Berisha Visar, Sandoval Steven, Utianski Rene, Liss Julie, Spanias Andreas
Department of Speech and Hearing Science, Arizona State University, Tempe, AZ 85287.
School of ECEE, SenSIP Center, Arizona State University, Tempe, AZ 85287.
Proc IEEE Int Conf Acoust Speech Signal Process. 2013:7562-7566. doi: 10.1109/ICASSP.2013.6639133.
The general aim of this work is to learn a unique statistical signature for the state of a particular speech pathology. We pose this as a speaker identification problem for dysarthric individuals. To that end, we propose a novel feature-selection algorithm that aims to minimize the effects of speaker-specific features (e.g., fundamental frequency) and maximize the effects of pathology-specific features (e.g., vocal tract distortions and speech rhythm). We derive a cost function for feature selection that trades off these two competing criteria. Furthermore, we develop an efficient algorithm that optimizes this cost function and test it on a set of 34 dysarthric and 13 healthy speakers. Results show that the proposed method yields a set of features related to the speech disorder rather than to an individual's speaking style. Compared with other feature-selection algorithms, the proposed approach improves performance on a disorder fingerprinting task by selecting features that are specific to the disorder.
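The abstract does not state the cost function or the optimization algorithm, so the following is only an illustrative sketch of the trade-off it describes, not the authors' method. It assumes a simple per-feature ranking: a Fisher-style separability score across disorder groups, minus a weight `lam` times the same score across speakers after each disorder group's mean has been removed. The names `fisher_score`, `rank_features`, and `lam`, and the synthetic data, are all hypothetical.

```python
import numpy as np


def fisher_score(X, labels):
    """Per-feature ratio of between-group to within-group sum of squares."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for g in np.unique(labels):
        Xg = X[labels == g]
        group_mean = Xg.mean(axis=0)
        between += len(Xg) * (group_mean - overall_mean) ** 2
        within += ((Xg - group_mean) ** 2).sum(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def rank_features(X, disorder, speaker, k, lam=1.0):
    """Return indices of the k features with the highest trade-off score.

    Assumed score (not from the paper):
        disorder separability - lam * speaker separability,
    where speaker separability is measured after centering each disorder
    group, so that disorder differences are not counted as speaker effects.
    """
    X_centered = X.astype(float).copy()
    for d in np.unique(disorder):
        mask = disorder == d
        X_centered[mask] -= X_centered[mask].mean(axis=0)
    score = fisher_score(X, disorder) - lam * fisher_score(X_centered, speaker)
    return np.argsort(score)[::-1][:k]


# Synthetic data: 4 speakers (2 per disorder group), 50 observations each.
rng = np.random.default_rng(0)
speaker = np.repeat(np.arange(4), 50)
disorder = (speaker >= 2).astype(int)
speaker_offset = np.array([-2.0, 2.0, -2.0, 2.0])[speaker]

X = rng.normal(size=(200, 3))
X[:, 0] += 3.0 * disorder    # feature 0: driven by the disorder
X[:, 1] += speaker_offset    # feature 1: driven by speaker identity
                             # feature 2: pure noise

selected = rank_features(X, disorder, speaker, k=2)
```

In this toy setup the disorder-driven feature should rank first and the speaker-driven feature should be excluded. Increasing `lam` penalizes speaker-specific variation more heavily, which is one simple way to express the competing criteria in a single score.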