IEEE Trans Biomed Eng. 2021 Oct;68(10):2986-2996. doi: 10.1109/TBME.2021.3058424. Epub 2021 Sep 20.
Evaluation of hypernasality requires extensive perceptual training by clinicians and extending this training on a large scale internationally is untenable; this compounds the health disparities that already exist among children with cleft. In this work, we present the objective hypernasality measure (OHM), a speech-based algorithm that automatically measures hypernasality in speech, and validate it relative to a group of trained clinicians.
We trained a deep neural network (DNN) on approximately 100 hours of a publicly-available healthy speech corpus to detect the presence of nasal acoustic cues generated through the production of nasal consonants and nasalized phonemes in speech. Importantly, this model does not require any clinical data for training. The posterior probabilities of the deep learning model were aggregated at the sentence and speaker-levels to compute the OHM.
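The two-level aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the aggregation function, so a simple arithmetic mean is assumed here, and the names `sentence_score` and `speaker_score` are hypothetical.

```python
from statistics import mean


def sentence_score(frame_posteriors):
    """Aggregate per-frame nasal posterior probabilities over one sentence.

    Assumption: aggregation is a plain mean; the abstract does not specify it.
    """
    if not frame_posteriors:
        raise ValueError("sentence has no frames")
    if any(p < 0.0 or p > 1.0 for p in frame_posteriors):
        raise ValueError("posteriors must lie in [0, 1]")
    return mean(frame_posteriors)


def speaker_score(sentences):
    """Aggregate sentence-level scores into one speaker-level score.

    `sentences` is a list of per-sentence lists of frame posteriors.
    Assumption: every sentence carries equal weight regardless of length.
    """
    if not sentences:
        raise ValueError("speaker has no sentences")
    return mean(sentence_score(s) for s in sentences)


if __name__ == "__main__":
    # Toy posteriors chosen only to exercise the functions; not real data.
    toy_speaker = [[0.2, 0.4], [0.6, 0.8]]
    print(speaker_score(toy_speaker))
```

Averaging sentence scores rather than pooling all frames keeps long sentences from dominating the speaker-level value; whether the OHM makes the same choice is not stated in the abstract.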
The OHM was significantly correlated with perceptual hypernasality ratings from the Americleft database (r = 0.797, p < 0.001) and the New Mexico Cleft Palate Center (NMCPC) database (r = 0.713, p < 0.001). In addition, we evaluated the relationship between the OHM and articulation errors, assessed the sensitivity of the OHM in detecting very mild hypernasality, and established the internal reliability of the metric. Further, we compared the performance of the OHM with that of a DNN regression algorithm trained directly on the hypernasal speech samples.
The results indicate that the OHM is able to measure the severity of hypernasality on par with Americleft-trained clinicians on this dataset.
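The r values reported above are correlation coefficients between the OHM and clinician ratings. As a hedged sketch, assuming they are Pearson coefficients (the abstract does not name the statistic), the computation looks like this; `pearson_r` is a hypothetical helper and the inputs below are toy numbers, not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered cross-products and squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values for illustration only: objective scores vs. perceptual ratings.
    objective_scores = [0.1, 0.3, 0.5, 0.9]
    perceptual_ratings = [0, 1, 2, 4]
    print(pearson_r(objective_scores, perceptual_ratings))
```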