Department of Biophysics, University of Life Sciences, Akademicka 13, 20-950 Lublin, Poland.
Faculty of Physical Education and Health in Biała Podlaska, Józef Piłsudski University of Physical Education in Warsaw, Akademicka 2, 21-500 Biała Podlaska, Poland.
Sensors (Basel). 2022 Jan 1;22(1):321. doi: 10.3390/s22010321.
This paper applies principal component analysis (PCA) to reduce the dimensionality of the variables describing a speech signal and examines how the results can be used to recognise disturbed and fluent speech. A set of fluent speech signals and three types of speech disturbance (blocks before words starting with plosives, syllable repetitions, and sound-initial prolongations) was transformed using PCA. The result was a model with four principal components describing the analysed utterances. Distances between the standardised original variables and the elements of the observation matrix in the new coordinate system were calculated and then used in the recognition process. A multilayer perceptron network served as the classifier. The results were compared with those of previous experiments in which speech samples were parameterised using a Kohonen network. The classifying network achieved an overall accuracy of 76% (from 50% to 91%, depending on the dysfluency type).