Leinonen L, Kangas J, Torkkola K, Juvas A
Department of Physiology, University of Helsinki.
J Speech Hear Res. 1992 Apr;35(2):287-95. doi: 10.1044/jshr.3502.287.
The vowel [a:] in a test word, judged normal or dysphonic, was examined with the Self-Organizing Map, the artificial neural network algorithm of Kohonen. The algorithm produces two-dimensional representations (maps) of speech. Input to the acoustic maps consisted of 15-component spectral vectors calculated at 9.83-msec intervals from short-time power spectra. The male and female maps were first calculated from the speech of healthy subjects and then the [a:] samples (15 successive spectral vectors) were examined on the maps. The dysphonic voices deviated from the norm both in the composition of the short-time power spectra (characterized by the dislocation of the trajectory pattern on the map) and in the stability of the spectrum during the performance (characterized by the pattern of the trajectory on the map). Rough voices were distinguished from breathy ones by their patterns on the map. With the limited speech material, an index for the degree of pathology could not be determined. A self-organized acoustic map provides an on-line visual representation of voice and speech in an easily understandable form. The method is thus suitable not only for diagnostic but also for educational and therapeutic purposes.
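The mapping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the map size, learning-rate schedule, neighborhood function, and the random stand-in data are all assumptions, and `path_length` is a hypothetical helper for describing a trajectory, not an index from the study. Only the 15-component input vectors and the 15-vector sample length come from the abstract.

```python
import math
import random

DIM = 15            # components per spectral vector (from the abstract)
SAMPLE_LEN = 15     # successive vectors per [a:] sample (from the abstract)
ROWS, COLS = 6, 6   # map size: an assumption, not stated in the abstract


def best_matching_unit(weights, v):
    """Return the (row, col) of the map unit closest to vector v."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(vectors, rows=ROWS, cols=COLS, epochs=5, seed=0):
    """Train a small self-organizing map (assumed schedule, Gaussian neighborhood)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total, step = epochs * len(vectors), 0
    for _ in range(epochs):
        for v in rng.sample(vectors, len(vectors)):  # shuffled pass over the data
            frac = step / total
            lr = 0.5 * (1.0 - frac)                              # decaying learning rate
            radius = max(1.0, max(rows, cols) / 2.0 * (1.0 - frac))  # shrinking neighborhood
            br, bc = best_matching_unit(weights, v)
            for r in range(rows):
                for c in range(cols):
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
            step += 1
    return weights


def trajectory(weights, sample):
    """Map each successive spectral vector to its best-matching unit."""
    return [best_matching_unit(weights, v) for v in sample]


def path_length(traj):
    """Hypothetical descriptor: total distance travelled on the map."""
    return sum(math.hypot(r2 - r1, c2 - c1)
               for (r1, c1), (r2, c2) in zip(traj, traj[1:]))


# Random vectors stand in for spectra of healthy speech (illustrative data only).
data_rng = random.Random(1)
training = [[data_rng.random() for _ in range(DIM)] for _ in range(100)]
som = train_som(training)

# A perfectly steady spectrum stays on one unit, so its trajectory has zero length.
steady_sample = [training[0]] * SAMPLE_LEN
steady_traj = trajectory(som, steady_sample)
print(steady_traj[0], path_length(steady_traj))
```

In this sketch, where a trajectory sits on the map corresponds to spectral composition and how far it wanders corresponds to spectral stability, the two properties the abstract reads from the map.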