Department of Electrical and Computer Engineering, Queen's University, Kingston, Ontario K7L 3N6, Canada.
Department of Speech and Hearing Science, University of Illinois Urbana-Champaign, Champaign, Illinois 61820, USA.
JASA Express Lett. 2023 Mar;3(3):035207. doi: 10.1121/10.0017648.
Many existing speech intelligibility prediction (SIP) algorithms can only account for acoustic factors affecting speech intelligibility and cannot predict intelligibility across corpora with different linguistic predictability. To address this, a linguistic component was added to five existing SIP algorithms by estimating linguistic corpus predictability using a pre-trained language model. The results showed improved SIP performance in terms of correlation and prediction error over a mixture of four datasets, each with a different English open-set corpus.
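The two quantities the abstract refers to, a corpus-level predictability estimate and the correlation and prediction-error metrics, can be illustrated with a minimal sketch. It does not reproduce the paper's method. It assumes per-token log-probabilities have already been obtained from some pre-trained language model, and uses mean surprisal as one possible predictability score. The function names and all numbers are hypothetical and chosen only for illustration.

```python
import math

def mean_surprisal(token_log_probs):
    """Average negative log-probability (in nats) per token.
    Lower values mean the text is more predictable under the language model.
    The log-probabilities are assumed to come from a pre-trained model."""
    return -sum(token_log_probs) / len(token_log_probs)

def pearson(x, y):
    """Pearson correlation between predicted and measured intelligibility."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def rmse(predicted, measured):
    """Root-mean-square prediction error."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

if __name__ == "__main__":
    # Made-up token log-probabilities for two hypothetical corpora.
    predictable_corpus = [math.log(0.5), math.log(0.4), math.log(0.6)]
    unpredictable_corpus = [math.log(0.05), math.log(0.02), math.log(0.1)]
    print("surprisal (predictable):  ", mean_surprisal(predictable_corpus))
    print("surprisal (unpredictable):", mean_surprisal(unpredictable_corpus))

    # Made-up predicted vs. measured intelligibility scores (proportion correct).
    predicted = [0.20, 0.45, 0.70, 0.90]
    measured = [0.25, 0.40, 0.75, 0.85]
    print("correlation:", pearson(predicted, measured))
    print("RMSE:       ", rmse(predicted, measured))
```

How a predictability score of this kind would be combined with an acoustic SIP output is not specified in the abstract, so the sketch leaves that step out.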