Revisiting the speaker discriminatory power of vowel formant frequencies under a likelihood ratio-based paradigm: The case of mismatched speaking styles.

Authors

Cavalcanti Julio Cesar, Eriksson Anders, Barbosa Plinio A, Madureira Sandra

Affiliations

Department of Linguistics, Stockholm University, Stockholm, Sweden.

Applied Linguistics and Language Studies Graduate Program, Pontifical Catholic University of São Paulo, São Paulo, Brazil.

Publication information

PLoS One. 2024 Dec 10;19(12):e0311363. doi: 10.1371/journal.pone.0311363. eCollection 2024.

DOI:10.1371/journal.pone.0311363
PMID:39656685
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11630611/
Abstract

Differentiating speakers by comparing their recorded speech is a common task in speaker characterization. In an acoustic-based approach, this task typically involves examining specific acoustic parameters and assessing their discriminatory capacity. This experimental study evaluated the speaker discriminatory power of vowel formants (resonance peaks of the vocal tract) in two speaking styles: Dialogue and Interview. Testing relied on metrics compatible with the likelihood ratio paradigm. Only high-quality recordings were analyzed. The participants were 20 male Brazilian Portuguese (BP) speakers from the same dialectal area. Two estimates of speaker discriminatory power were examined through Multivariate Kernel Density analysis: the log-likelihood-ratio cost (Cllr) and the equal error rate (EER). As expected, discriminatory performance was stronger in style-matched analyses than in mismatched-style analyses. In order of relevance, F3, F4, and F1 performed best in style-matched comparisons, as indicated by lower Cllr and EER values. F2 performed worst within each style, in both Dialogue and Interview. The discriminatory power of all individual formants (F1-F4) appeared to be affected in the mismatched condition, demonstrating that discriminatory power is sensitive to style-driven changes in speech production. The combination of higher formants 'F3 + F4' outperformed the combination of lower formants 'F1 + F2'. However, in mismatched-style analyses, the magnitude of improvement in Cllr and EER scores increased as more formants were incorporated into the model. The best discriminatory performance was achieved when most formants were combined. Applying multivariate analysis not only reduced average Cllr and EER scores but also shifted the overall probability density distribution towards lower Cllr and EER values. In general, front and central vowels were more speaker discriminatory than back vowels as far as the 'F1 + F2' relation was concerned.
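The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: Cllr as the average base-2 log penalty over same-speaker and different-speaker likelihood ratios, and EER as the error rate at the threshold where false rejections and false acceptances are closest. The function names `cllr` and `eer` and the likelihood-ratio values are hypothetical and chosen for illustration only; this is not the authors' pipeline, and it does not reproduce their Multivariate Kernel Density step.

```python
import math


def cllr(same_lrs, diff_lrs):
    """Log-likelihood-ratio cost from raw likelihood ratios (not logs).

    same_lrs: LRs from same-speaker comparisons (ideally much greater than 1).
    diff_lrs: LRs from different-speaker comparisons (ideally much less than 1).
    """
    same_term = sum(math.log2(1 + 1 / lr) for lr in same_lrs) / len(same_lrs)
    diff_term = sum(math.log2(1 + lr) for lr in diff_lrs) / len(diff_lrs)
    return 0.5 * (same_term + diff_term)


def eer(same_lrs, diff_lrs):
    """Approximate equal error rate by sweeping a decision threshold.

    At each threshold t, a comparison is accepted as same-speaker if LR >= t.
    Returns the mean of the two error rates where they are closest.
    """
    thresholds = sorted(set(same_lrs) | set(diff_lrs)) + [math.inf]
    best_gap, best_rate = math.inf, None
    for t in thresholds:
        frr = sum(lr < t for lr in same_lrs) / len(same_lrs)   # false rejections
        far = sum(lr >= t for lr in diff_lrs) / len(diff_lrs)  # false acceptances
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2
    return best_rate


# Illustrative values only (not data from the study).
same_speaker = [10.0, 100.0]
diff_speaker = [0.1, 0.01]
print(cllr(same_speaker, diff_speaker), eer(same_speaker, diff_speaker))
```

Under these definitions, a system that always outputs a likelihood ratio of 1 (no information) has a Cllr of exactly 1, so values below 1 indicate that the system is contributing useful information, and lower values of both Cllr and EER indicate better speaker discrimination.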

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/11e22b62ba12/pone.0311363.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/10d4099f94c8/pone.0311363.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/84b311eb7544/pone.0311363.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/508945af5187/pone.0311363.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/9f07a6405371/pone.0311363.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/356665ac5d8e/pone.0311363.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f813/11630611/3ecbbde6fce2/pone.0311363.g007.jpg
