Performance Evaluation of Subharmonic-to-Harmonic Ratio (SHR) Computation.

Affiliation

Antonio Salieri Department of Vocal Studies and Vocal Research in Music Education, University of Music and Performing Arts Vienna, Vienna, Austria.

Publication information

J Voice. 2021 May;35(3):365-375. doi: 10.1016/j.jvoice.2019.11.005. Epub 2020 Mar 9.

DOI:10.1016/j.jvoice.2019.11.005
PMID:32165022
Abstract

Subharmonics are an important class of voice signals, relevant for speech, pathological voice, singing, and animal bioacoustics. They arise from special cases of amplitude modulation (AM) or frequency modulation (FM) of the time-domain signal. Surprisingly, to date there is only one open-source subharmonics detector available to the scientific community: Sun's subharmonic-to-harmonic ratio (SHR). Here, this algorithm was subjected to a formal evaluation with two data sets of synthesized and empirical speech samples. Both data sets consisted of electroglottographic (EGG) signals, i.e., a physiological correlate of vocal fold oscillation that bypasses vocal tract acoustics. Data Set I contained 2560 synthesized EGG signals with varying degrees of AM and FM, fundamental frequency (fo), periodicity, and signal-to-noise ratio (SNR). Data Set II was made up of 25 EGG samples extracted from the CMU Arctic speech database. For a "ground truth" of subharmonicity, these samples were manually annotated by a group of five external experts. Analysis of the synthesized data suggested that the SHR metric is relatively robust as long as the subharmonic modulation extent is below 0.35 and 0.7 for the FM and AM scenarios, respectively. In the CMU Arctic speech data samples, the SHR analysis reached a maximum sensitivity of about 87% at a specificity of over 90%, but only for adaptive algorithm parameter settings. In contrast, the algorithm's default parameter settings could only successfully classify about 9% of all subharmonic instances. The SHR is a useful metric for assessing the degree of subharmonics contained in voice signals, but only at adaptive parameter settings. In particular, the frequency ceiling should be set to five times the highest fo, and the frame length to at least five times the largest fundamental period of the analyzed signal. For subharmonic classification, a threshold of SHR ≥ 0.01 is recommended.
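The parameter recommendations in the abstract can be illustrated with a minimal Python sketch. It does not implement Sun's SHR algorithm. It only derives the recommended settings from an fo range and applies the recommended classification threshold to SHR values that are assumed to have been computed elsewhere. All names here (`ShrSettings`, `adaptive_shr_settings`, `classify_frames`, `sensitivity_specificity`) are hypothetical and chosen for illustration, and the largest fundamental period is taken to be the reciprocal of the lowest fo.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ShrSettings:
    """Hypothetical container for adaptive SHR analysis settings."""
    frequency_ceiling_hz: float   # upper frequency limit of the analysis
    min_frame_length_s: float     # shortest acceptable analysis frame
    shr_threshold: float = 0.01   # classification threshold from the abstract


def adaptive_shr_settings(fo_min_hz: float, fo_max_hz: float) -> ShrSettings:
    """Derive settings from the lowest and highest fo of the signal.

    Ceiling = 5 x highest fo; frame length >= 5 x largest fundamental
    period, where the largest period is assumed to be 1 / lowest fo.
    """
    if not 0.0 < fo_min_hz <= fo_max_hz:
        raise ValueError("expected 0 < fo_min_hz <= fo_max_hz")
    return ShrSettings(
        frequency_ceiling_hz=5.0 * fo_max_hz,
        min_frame_length_s=5.0 / fo_min_hz,
    )


def classify_frames(shr_values: Sequence[float],
                    threshold: float = 0.01) -> list:
    """Mark a frame as subharmonic when its SHR is >= threshold."""
    return [value >= threshold for value in shr_values]


def sensitivity_specificity(predicted: Sequence[bool],
                            annotated: Sequence[bool]) -> Tuple[float, float]:
    """Compare frame decisions against expert annotations."""
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have equal length")
    pairs = list(zip(predicted, annotated))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Illustrative fo range only; not taken from the study's data.
    settings = adaptive_shr_settings(fo_min_hz=80.0, fo_max_hz=300.0)
    print(settings.frequency_ceiling_hz)  # 1500.0
    print(settings.min_frame_length_s)    # 0.0625
```

With an illustrative fo range of 80 to 300 Hz, the sketch yields a frequency ceiling of 1500 Hz and a minimum frame length of 62.5 ms. The sensitivity and specificity helper mirrors the kind of comparison against expert annotations described for Data Set II, though the study's exact scoring procedure is not given in the abstract.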


Similar articles

1
Performance Evaluation of Subharmonic-to-Harmonic Ratio (SHR) Computation.
J Voice. 2021 May;35(3):365-375. doi: 10.1016/j.jvoice.2019.11.005. Epub 2020 Mar 9.
2
Freddie Mercury-acoustic analysis of speaking fundamental frequency, vibrato, and subharmonics.
Logoped Phoniatr Vocol. 2017 Apr;42(1):29-38. doi: 10.3109/14015439.2016.1156737. Epub 2016 Apr 15.
3
Fundamental Frequency Estimation of Low-quality Electroglottographic Signals.
J Voice. 2019 Jul;33(4):401-411. doi: 10.1016/j.jvoice.2018.01.003. Epub 2018 May 31.
4
Electroglottographic and acoustic analysis of voice in children with vocal nodules.
Int J Pediatr Otorhinolaryngol. 2019 Jul;122:82-88. doi: 10.1016/j.ijporl.2019.03.030. Epub 2019 Apr 2.
5
Inspiratory Vocal Fry: Anatomical and Physiological Aspects, Application in Speech Therapy, Vocal Pedagogy and Singing. A Pilot study.
J Voice. 2021 May;35(3):394-399. doi: 10.1016/j.jvoice.2019.10.004. Epub 2019 Nov 7.
6
Towards a Singing Voice Multi-Sensor Analysis Tool: System Design, and Assessment Based on Vocal Breathiness.
Sensors (Basel). 2021 Nov 30;21(23):8006. doi: 10.3390/s21238006.
7
Overdrive and Edge as Refiners of "Belting"?: An Empirical Study Qualifying and Categorizing "Belting" Based on Audio Perception, Laryngostroboscopic Imaging, Acoustics, LTAS, and EGG.
J Voice. 2017 May;31(3):385.e11-385.e22. doi: 10.1016/j.jvoice.2016.09.006. Epub 2016 Nov 18.
8
Comparing Vocal Fold Contact Criteria Derived From Audio and Electroglottographic Signals.
J Voice. 2016 Jul;30(4):381-8. doi: 10.1016/j.jvoice.2015.05.015. Epub 2015 Nov 3.
9
Voice Characteristics of Young Girl Role in Kunqu Opera.
J Voice. 2019 Nov;33(6):945.e19-945.e25. doi: 10.1016/j.jvoice.2018.07.011. Epub 2018 Aug 14.
10
An Examination of the Relationship Between Electroglottographic Contact Quotient, Electroglottographic Decontacting Phase Profile, and Acoustical Spectral Moments.
J Voice. 2015 Sep;29(5):519-29. doi: 10.1016/j.jvoice.2014.10.016. Epub 2015 Mar 17.

Cited by

1
Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline.
Bioacoustics. 2025;34(4):419-446. doi: 10.1080/09524622.2025.2500380. Epub 2025 Jun 2.
2
How to analyse and manipulate nonlinear phenomena in voice recordings.
Philos Trans R Soc Lond B Biol Sci. 2025 Apr 3;380(1923):20240003. doi: 10.1098/rstb.2024.0003.
3
Nonlinear phenomena in mammalian vocal communication: an introduction and scoping review.
Philos Trans R Soc Lond B Biol Sci. 2025 Apr 3;380(1923):20240017. doi: 10.1098/rstb.2024.0017.
4
Speaker discrimination performance for "easy" versus "hard" voices in style-matched and -mismatched speech.
J Acoust Soc Am. 2022 Feb;151(2):1393. doi: 10.1121/10.0009585.