Faulkner A, Walliker J R, Howard I S, Ball V, Fourcin A J
Department of Phonetics and Linguistics, University College London, United Kingdom.
Scand Audiol Suppl. 1993;38:124-35.
Two new developments in speech pattern processing hearing aids are described. The first is the use of compound speech pattern coding. Speech information that is invisible to the lipreader was encoded in terms of three acoustic speech factors: the voice fundamental frequency pattern, coded as a sinusoid; the presence of aperiodic excitation, coded as a low-frequency noise; and the wide-band amplitude envelope, coded by amplitude modulation of the sinusoid and noise signals. Each element of the compound stimulus was individually matched in frequency and intensity to the listener's receptive range. Audio-visual speech reception was assessed in five profoundly hearing-impaired listeners to examine the contributions of adding voiceless and amplitude information to the voice fundamental frequency pattern, and to compare these codings with amplified speech. In both consonant recognition and connected discourse tracking (CDT), all five subjects showed an advantage from the addition of amplitude information to the fundamental frequency pattern. In consonant identification, all five subjects showed further improvements when voiceless speech excitation was additionally encoded together with amplitude information, but this effect was not found in CDT. The addition of voiceless information to voice fundamental frequency information did not improve performance in the absence of amplitude information. Three of the subjects performed significantly better in at least one of the compound speech pattern conditions than with amplified speech, while the other two performed similarly with amplified speech and the best compound speech pattern condition. The three speech pattern elements encoded here may represent a near-optimal basis for an acoustic aid to lipreading for this group of listeners.
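The three-element coding described above can be illustrated with a minimal synthesis sketch. This is not the authors' implementation; the function name, per-sample input arrays, and the moving-average low-pass are all illustrative assumptions, and the matching of each element to a listener's receptive range is omitted.

```python
import numpy as np

def encode_compound(f0_hz, voiced, aperiodic, amp, sr=8000):
    """Sketch of a three-element compound speech pattern code.

    All inputs are per-sample arrays (hypothetical representation):
    f0_hz     -- voice fundamental frequency in Hz
    voiced    -- 1.0 where voiced excitation is present, else 0.0
    aperiodic -- 1.0 where aperiodic (voiceless) excitation is present
    amp       -- wide-band amplitude envelope, scaled to 0..1
    """
    n = len(f0_hz)
    # Element 1: a sinusoid following the voice fundamental frequency,
    # built with a phase-continuous oscillator.
    phase = 2.0 * np.pi * np.cumsum(f0_hz) / sr
    sine = np.sin(phase) * voiced
    # Element 2: low-frequency noise marking aperiodic excitation; a short
    # moving average stands in for a proper low-pass filter here.
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(n)
    lf_noise = np.convolve(noise, np.ones(16) / 16, mode="same") * aperiodic
    # Element 3: the amplitude envelope modulates both carriers.
    return amp * (sine + lf_noise)
```

In the study, each element would additionally be transposed in frequency and scaled in intensity to fit the individual listener's residual hearing; that per-listener matching stage is not sketched here.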
The second development is the use of a trained multi-layer perceptron (MLP) pattern classification algorithm as the basis for a robust real-time voice fundamental frequency extractor. This algorithm runs on a low-power digital signal processor that can be incorporated in a wearable hearing aid. Aided lipreading of speech in noise was assessed in the same five profoundly hearing-impaired listeners to compare the benefits of conventional hearing aids with those of an aid providing MLP-based fundamental frequency information together with speech-plus-noise amplitude information. The MLP-based pattern element aid gave significantly better reception of consonantal voicing contrasts for speech in pink noise than conventional amplification, and consequently also gave better overall performance in audio-visual consonant identification. (ABSTRACT TRUNCATED AT 400 WORDS)
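The abstract does not give the network architecture or features, but the idea of F0 extraction as pattern classification can be sketched as a small MLP that maps a feature frame onto one of a set of candidate fundamental frequency values. The function names, layer sizes, weights, and F0 bins below are illustrative assumptions, not the published design.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mlp_f0_classify(frames, W1, b1, W2, b2, f0_bins):
    """Hypothetical MLP fundamental-frequency classifier.

    frames  -- (n_frames, n_features) feature vectors from the input signal
    W1, b1  -- trained weights/bias of the sigmoid hidden layer
    W2, b2  -- trained weights/bias of the output layer
    f0_bins -- candidate F0 values in Hz, one per output unit
    Returns one F0 estimate per frame (the most probable bin).
    """
    h = 1.0 / (1.0 + np.exp(-(frames @ W1 + b1)))  # sigmoid hidden layer
    p = softmax(h @ W2 + b2)                       # distribution over F0 bins
    return f0_bins[np.argmax(p, axis=-1)]
```

A classifier of this form involves only matrix multiplies and pointwise nonlinearities per frame, which is why it suits a low-power digital signal processor in a wearable aid.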