The shift-invariant discrete wavelet transform and application to speech waveform analysis.

Authors

Enders Jörg, Geng Weihua, Li Peijun, Frazier Michael W, Scholl David J

Affiliation

Department of Mathematics, Michigan State University, East Lansing, Michigan 48824-1027, USA.

Publication

J Acoust Soc Am. 2005 Apr;117(4 Pt 1):2122-33. doi: 10.1121/1.1869732.

Abstract

The discrete wavelet transform may be used as a signal-processing tool for visualization and analysis of nonstationary, time-sampled waveforms. The highly desirable property of shift invariance can be obtained at the cost of a moderate increase in computational complexity, and accepting a least-squares inverse (pseudoinverse) in place of a true inverse. A new algorithm for the pseudoinverse of the shift-invariant transform that is easier to implement in array-oriented scripting languages than existing algorithms is presented together with self-contained proofs. Representing only one of the many and varied potential applications, a recorded speech waveform illustrates the benefits of shift invariance with pseudoinvertibility. Visualization shows the glottal modulation of vowel formants and frication noise, revealing secondary glottal pulses and other waveform irregularities. Additionally, performing sound waveform editing operations (i.e., cutting and pasting sections) on the shift-invariant wavelet representation automatically produces quiet, click-free section boundaries in the resulting sound. The capabilities of this wavelet-domain editing technique are demonstrated by changing the rate of a recorded spoken word. Individual pitch periods are repeated to obtain a half-speed result, and alternate individual pitch periods are removed to obtain a double-speed result. The original pitch and formant frequencies are preserved. In informal listening tests, the results are clear and understandable.
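To make the idea of shift invariance with a least-squares inverse concrete, here is a minimal sketch. It is not the algorithm from the paper. It assumes an undecimated ("à trous") Haar filter pair, circular boundary handling, and filters scaled by 1/2 so that each stage preserves energy. Under those assumptions the pseudoinverse is simply the adjoint of the analysis step. The function names `si_haar_forward` and `si_haar_pinv` are hypothetical, and the paper's transform may use different filters and normalization.

```python
import numpy as np


def si_haar_forward(x, levels):
    """Undecimated Haar analysis with circular boundaries (illustrative only).

    Returns the coarsest approximation band and a list of detail bands,
    each the same length as x.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for j in range(levels):
        # Filter taps are spread 2**j samples apart instead of downsampling.
        shifted = np.roll(approx, 2 ** j)
        details.append((approx - shifted) / 2.0)
        approx = (approx + shifted) / 2.0
    return approx, details


def si_haar_pinv(approx, details):
    """Least-squares inverse of si_haar_forward.

    Each analysis stage is energy preserving under the assumed 1/2 scaling,
    so applying the adjoint of each stage in reverse order gives the
    pseudoinverse, even for coefficients that were edited after analysis.
    """
    a = np.asarray(approx, dtype=float)
    for j in reversed(range(len(details))):
        d = np.asarray(details[j], dtype=float)
        step = 2 ** j
        a = (a + np.roll(a, -step)) / 2.0 + (d - np.roll(d, -step)) / 2.0
    return a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)
    approx, details = si_haar_forward(x, levels=3)

    # Unedited coefficients reconstruct the input.
    print(np.allclose(si_haar_pinv(approx, details), x))

    # Shifting the input shifts every band by the same amount.
    approx_s, details_s = si_haar_forward(np.roll(x, 5), levels=3)
    print(np.allclose(approx_s, np.roll(approx, 5)))
```

Because the representation is redundant (four bands of 16 samples for a 16-sample input in the demo), an edited set of coefficients generally does not correspond to any signal exactly. The pseudoinverse returns the signal whose transform is closest to the edited coefficients in the least-squares sense, which is the property the abstract relies on for cutting and pasting in the wavelet domain.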
