
Acoustic analysis and detection of pharyngeal fricative in cleft palate speech using correlation of signals in independent frequency bands and octave spectrum prominent peak.

Affiliations

College of Electrical Engineering, Sichuan University, 610065, Chengdu, China.

West China Hospital of Stomatology, Sichuan University, 610041, Chengdu, China.

Publication information

Biomed Eng Online. 2020 May 27;19(1):36. doi: 10.1186/s12938-020-00782-3.

Abstract

BACKGROUND

Pharyngeal fricative is a typical compensatory articulation error in cleft palate speech. It negatively affects daily communication for people who have it. Automatic detection of pharyngeal fricatives in cleft palate speech can give clinicians and speech-language pathologists information that aids diagnosis.

RESULTS

This paper proposes two features for detecting pharyngeal fricative speech: CSIFs (correlation of signals in independent frequency bands) and OSPP (octave spectrum prominent peak). The CSIFs feature captures the distribution of frequency components in pharyngeal fricative speech, which results from the changed place of articulation and the movement of the articulators. The OSPP feature reflects how concentrated the prominent spectral peak is, which is closely related to the place of articulation in pharyngeal fricatives. Both features are examined in relation to the altered production process of pharyngeal fricatives. To evaluate how well these two features detect pharyngeal fricatives, we collected a speech database covering all types of initial consonants in which pharyngeal fricatives occur. The classifier used to discriminate pharyngeal fricative speech from normal speech is based on ensemble learning.
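The abstract does not define how OSPP is computed, so the following is only a rough illustration of the general idea of measuring how concentrated spectral energy is in one octave band. It is not the authors' method. The function names, the 125 Hz starting frequency, and the energy-ratio measure are all assumptions made for this sketch.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def octave_band_energies(frame, sample_rate, f_low=125.0):
    """Sum spectral power in octave-wide bands [f, 2f).

    f_low is a hypothetical starting frequency chosen for illustration.
    Returns a list of (low_hz, high_hz, energy) tuples.
    """
    mags = magnitude_spectrum(frame)
    n = len(frame)
    nyquist = sample_rate / 2
    bands = []
    lo = f_low
    while lo < nyquist:
        hi = min(2 * lo, nyquist)
        energy = sum(
            m * m for k, m in enumerate(mags) if lo <= k * sample_rate / n < hi
        )
        bands.append((lo, hi, energy))
        lo *= 2
    return bands


def peak_concentration(bands):
    """Return the strongest band and its share of the total band energy.

    This ratio is an assumed stand-in for a 'prominent peak' measure.
    """
    total = sum(energy for _, _, energy in bands)
    lo, hi, energy = max(bands, key=lambda b: b[2])
    share = energy / total if total > 0 else 0.0
    return (lo, hi), share
```

For a pure 1000 Hz tone sampled at 8000 Hz, nearly all of the energy falls into the 1000 to 2000 Hz band, so the share is close to 1. Noise-like fricative frames would spread energy across several bands and give a lower share.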

CONCLUSION

The detection accuracy obtained with CSIFs and OSPP features ranges from 83.5 to 84.5% and from 85 to 87%, respectively. When these two features are combined, the detection accuracy for pharyngeal fricative speech ranges from 88 to 89%, with an AUC (area under the receiver operating characteristic curve) value of 93%.
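To make the two reported metrics concrete, here is a minimal sketch of how accuracy and AUC can be computed from classifier scores. The labels and scores in the usage note are made up for illustration and are not data from the paper. The AUC here uses the standard pairwise-ranking definition: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

With hypothetical labels `[0, 0, 1, 1]` (1 = pharyngeal fricative) and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive-negative pairs are ranked correctly, so `roc_auc` returns 0.75. At a threshold of 0.5, three of the four samples are classified correctly, so `accuracy` also returns 0.75.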
