PVR-AFM: A Pathological Voice Repair System based on Non-linear Structure.

Affiliations

School of Electrical and Information Engineering, Tianjin University, Tianjin, China, 300072.

Publication information

J Voice. 2023 Sep;37(5):648-662. doi: 10.1016/j.jvoice.2021.05.010. Epub 2021 Jul 5.

Abstract

OBJECTIVE

Speech signal processing has become an important technique for ensuring that voice interaction systems communicate accurately with users, because it improves the clarity and intelligibility of speech signals. However, most existing work focuses only on processing typical voices and ignores the communication needs of individuals with voice disorders, including voice professionals, older people, and smokers. To meet this need, it is essential to design a non-invasive repair system that processes pathological voices.

METHODS

In this paper, we propose a repair system for several vowels affected by vocal polyps, such as /a/, /i/, and /u/. We use a non-linear model with an amplitude-modulation (AM) and frequency-modulation (FM) structure to extract the pitch and formants of the pathological voice. To address breaks and instability in the pitch contour, we provide a pitch extraction algorithm that keeps the pitch stable and avoids the pitch-doubling errors caused by instability of the low-frequency signal. Furthermore, we design a formant reconstruction mechanism that determines the formant frequency and bandwidth needed to repair the formants.
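The abstract does not say how the AM-FM model is demodulated, so the following is only an illustrative sketch of one well-known option, the Teager-Kaiser energy operator with the DESA-2 energy separation algorithm. It is not presented as the authors' method. The function names `teager_energy` and `desa2`, the 8 kHz sampling rate, and the 200 Hz test tone are all assumptions chosen for the example.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1], for interior n."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x, fs):
    """Estimate per-sample amplitude and frequency (Hz) of an AM-FM signal.

    Returns two lists covering samples n = 2 .. len(x) - 3 of the input.
    """
    # Symmetric difference y[n] = x[n+1] - x[n-1]; y[0] corresponds to n = 1.
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_energy(x)  # psi_x[0] corresponds to n = 1
    psi_y = teager_energy(y)  # psi_y[0] corresponds to n = 2
    amps, freqs = [], []
    for k, ey in enumerate(psi_y):
        ex = psi_x[k + 1]  # align both energies on sample n = k + 2
        if ex <= 0.0 or ey <= 0.0:
            # The estimate is undefined here (e.g. silence or strong noise).
            amps.append(float("nan"))
            freqs.append(float("nan"))
            continue
        # For a pure cosine, 1 - psi_y / (2 * psi_x) equals cos(2 * omega).
        c = max(-1.0, min(1.0, 1.0 - ey / (2.0 * ex)))
        omega = 0.5 * math.acos(c)  # radians per sample
        freqs.append(omega * fs / (2.0 * math.pi))
        amps.append(2.0 * ex / math.sqrt(ey))
    return amps, freqs


# Hypothetical test tone: constant amplitude 0.5 at 200 Hz, sampled at 8 kHz.
fs = 8000
x = [0.5 * math.cos(2.0 * math.pi * 200.0 * n / fs) for n in range(400)]
amps, freqs = desa2(x, fs)
```

On a clean tone the estimates recover the amplitude and frequency of the input. DESA-2 is only valid for frequencies below a quarter of the sampling rate, and on real speech it is normally applied to band-pass filtered components (for example, one band per formant) rather than to the raw waveform, since the energy operator is sensitive to noise.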

RESULTS

Spectral observation and objective indicators show that the system achieves better performance in improving the intelligibility of pathological speech.
