Kollmeier B, Wesselkamp M
AG Medizinische Physik, Universität Oldenburg, Germany.
J Acoust Soc Am. 1997 Oct;102(4):2412-21. doi: 10.1121/1.419624.
A German sentence test was developed that comprises 20 test lists of ten sentences each. The test corpus is a selection from sentences for speech quality evaluation recorded with an untrained male speaker. Performance-intensity curves were measured for each individual sentence in a speech-simulating babble noise with a total of 40 normal-hearing listeners. Based on these data and the phonemic transcription of the 200 sentences selected from the underlying speech corpus, 20 test lists were composed using a numerical optimization process. These 20 test lists are highly equivalent with respect to their performance-intensity curves, the number of words within each test list, the number of phonemes within each test list, and, approximately, the frequency distribution of the phonemes, which in turn approximates the phoneme frequency distribution of the German language. The equivalence of the respective performance-intensity curves was demonstrated in an independent experiment with 20 normal-hearing listeners. In addition, the "objective" intelligibility measurements were compared with two "subjective" speech intelligibility rating methods employing the same materials. Both subjective assessment procedures correlate highly with each other and with the "objective" procedure across sentences. This underlines the applicability and validity of the test in combination with time-saving subjective assessment methods. Moreover, the variability in performance across different sentences correlates inversely with the RMS level of the respective sentence. This indicates that adjusting the sentence material with respect to RMS level alone already yields reasonably homogeneous test material with respect to intelligibility.
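The abstract does not say which numerical optimization was used to compose the lists, so the following is only a minimal sketch of one plausible approach: a random pairwise-swap search that moves sentences between lists whenever the swap makes the list means of a per-sentence intelligibility score more similar. The function names, the use of a single score per sentence, and the synthetic values in the demo are all assumptions made for illustration. The actual procedure also balanced word counts, phoneme counts, and phoneme frequency distributions, which this sketch ignores.

```python
import random


def variance_of_means(lists, score):
    """Population variance of the per-list mean score (lower = more equivalent lists)."""
    means = [sum(score[i] for i in lst) / len(lst) for lst in lists]
    grand = sum(means) / len(means)
    return sum((m - grand) ** 2 for m in means) / len(means)


def balance(lists, score, iters=5000, seed=0):
    """Hypothetical swap search: exchange two sentences between two lists
    and keep the exchange only if the lists become more similar."""
    rng = random.Random(seed)
    lists = [list(lst) for lst in lists]  # work on a copy
    cost = variance_of_means(lists, score)
    for _ in range(iters):
        a, b = rng.sample(range(len(lists)), 2)
        i = rng.randrange(len(lists[a]))
        j = rng.randrange(len(lists[b]))
        lists[a][i], lists[b][j] = lists[b][j], lists[a][i]
        new_cost = variance_of_means(lists, score)
        if new_cost < cost:
            cost = new_cost  # keep the improving swap
        else:
            # undo the swap: it did not improve equivalence
            lists[a][i], lists[b][j] = lists[b][j], lists[a][i]
    return lists


if __name__ == "__main__":
    # Synthetic per-sentence scores (made-up numbers, NOT data from the study).
    rng = random.Random(1)
    score = [rng.gauss(-6.0, 1.5) for _ in range(200)]
    # Naive starting point: 20 lists of 10 consecutive sentences.
    start = [list(range(k * 10, (k + 1) * 10)) for k in range(20)]
    result = balance(start, score)
    print("variance of list means before:", variance_of_means(start, score))
    print("variance of list means after: ", variance_of_means(result, score))
```

Because only improving swaps are kept, the variance of the list means never increases relative to the starting partition. A multi-criteria version would replace `variance_of_means` with a weighted cost over all the quantities to be equalized.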