Nix D A, Papcun G, Hogden J, Zlokarnik I
Computer Research and Applications Speech Project, Los Alamos National Laboratory, New Mexico 87545, USA.
J Acoust Soc Am. 1996 Jun;99(6):3707-17. doi: 10.1121/1.414968.
Desirable characteristics of a vocal-tract parametrization include accuracy, low dimensionality, and generalizability across speakers and languages. A low-dimensional, speaker-independent linear parametrization of vowel tongue shapes can be obtained using the PARAFAC three-mode factor analysis procedure [Harshman et al., J. Acoust. Soc. Am. 62, 693-707 (1977)]. Harshman et al. applied PARAFAC to midsagittal x-ray vowel data from five English speakers, reporting that two speaker-independent factors are required to accurately represent the tongue shape measured along anatomically normalized vocal-tract diameter grid lines. Subsequently, the cross-linguistic generality of this parametrization was called into question by an application of PARAFAC to Icelandic vowel data, for which three nonorthogonal factors were reported [Jackson, J. Acoust. Soc. Am. 84, 124-143 (1988)]. Jackson's three-factor solution is shown here to be degenerate; a reanalysis of Jackson's Icelandic data produces two factors that match Harshman et al.'s factors for English vowels, contradicting Jackson's distinction between English and Icelandic language-specific "articulatory primes". To obtain vowel factors not constrained by artificial measurement grid lines, x-ray tongue shape traces of six English speakers were marked with 13 equally spaced points. PARAFAC analysis of these unconstrained (x,y) coordinate data results in two factors that are clearly interpretable in terms of the traditional vowel quality dimensions front/back and high/low.
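As a rough illustration of the three-mode model the abstract refers to, the sketch below fits a two-factor PARAFAC decomposition, x[i, j, k] ≈ Σ_r A[i, r]·B[j, r]·C[k, r], by alternating least squares. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the tensor is synthetic random data, the layout (10 vowels × 13 tongue points × 5 speakers) is only loosely modeled on the abstract, and the names `khatri_rao` and `parafac_als` are hypothetical helpers introduced here.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, U.shape[1])


def parafac_als(X, rank, n_iter=500, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Random starting values for the three loading matrices.
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfold the tensor along each mode (row-major, last index varies fastest).
    X0 = X.reshape(I, J * K)                     # columns indexed by (j, k)
    X1 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # columns indexed by (i, k)
    X2 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # columns indexed by (i, j)
    for _ in range(n_iter):
        # Update one loading matrix at a time, holding the other two fixed.
        A = np.linalg.lstsq(khatri_rao(B, C), X0.T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), X1.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X2.T, rcond=None)[0].T
    return A, B, C


# Synthetic stand-in data (an assumption, not the x-ray measurements):
# 10 vowels x 13 tongue points x 5 speakers, generated from two known factors.
rng = np.random.default_rng(42)
A_true = rng.standard_normal((10, 2))  # vowel loadings
B_true = rng.standard_normal((13, 2))  # tongue-point loadings
C_true = rng.standard_normal((5, 2))   # speaker loadings
X = np.einsum("ir,jr,kr->ijk", A_true, B_true, C_true)

A, B, C = parafac_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this layout each column of `B` plays the role of a speaker-independent tongue-shape factor, while `A` and `C` hold the vowel and speaker weights on those factors. Real data would need preprocessing and a check for degenerate solutions, neither of which this sketch attempts.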