Can acoustic measurements predict gender perception in the voice?

Affiliations

Department of Human Development, Universidade Estadual de Campinas-UNICAMP, Campinas, Brazil.

Department of Speech and Language Pathology, Federal University of Paraíba-UFPB, João Pessoa, Brazil.

Publication information

PLoS One. 2024 Nov 14;19(11):e0310794. doi: 10.1371/journal.pone.0310794. eCollection 2024.

Abstract

PURPOSE

To determine if there is an association between vocal gender presentation and the gender and context of the listener.

METHOD

Quantitative, cross-sectional study. Forty-seven speakers of Brazilian Portuguese of different genders were recorded. Recordings included sustained vowel emission, connected speech, and the expressive recital of a poem. Subsequently, four Praat scripts were used to extract 16 acoustic measurements related to prosody. The voices underwent Auditory-Perceptual Assessment (APA) of gender presentation by 236 judges: 65 speech and language pathologists (SLP) with experience in the area of voice, 101 cisgender people (CG), and 70 transgender and non-binary people (TNB). Gender presentation was rated on a visual analogue scale. Agreement analyses were performed among the quantitative variables, and multiple linear regression models were generated to predict APA, taking the judge context/gender and speaker gender into consideration.
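As an illustration of the regression step described above, the following is a minimal sketch of fitting a multiple linear regression that predicts a visual analogue scale rating from two acoustic predictors. It is not the authors' analysis: the function names `fit_ols` and `predict`, the choice of predictors (median fo and HNR), and all numbers are assumptions made up for the example, and the data are synthetic. In the study, the models also took judge group and speaker gender into account, which this sketch does not attempt.

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X: list of rows of predictor values (no intercept column).
    y: list of responses.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]  # prepend intercept
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def predict(coefs, row):
    """Predicted rating for one row of predictor values."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Synthetic, illustrative data only (NOT from the study).
# Columns: median fo in Hz, HNR in dB. y: rating on a 0-100 scale,
# generated noise-free from hypothetical coefficients so the fit is exact.
X = [[110, 18], [125, 20], [160, 17], [190, 22], [210, 21], [230, 24]]
true = [-40.0, 0.5, 1.0]
y = [predict(true, r) for r in X]

coefs = fit_ols(X, y)
```

Because the toy responses are generated without noise, `fit_ols` recovers the hypothetical coefficients; with real ratings, a statistics package that also reports standard errors and p-values would be the more appropriate tool.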

RESULTS

Acoustic analysis revealed that cisgender and transgender women had higher median fundamental frequency (fo) values than other genders. Cisgender women exhibited greater breathiness, while cisgender men showed more vocal quality deviations. In terms of APA, significant differences were observed among judge groups: SLP judges rated vowel samples differently from the other groups, and TNB judges rated speech samples differently (both p < 0.001). The predictive measures for the APA varied with sample type, speaker gender, and judge group. For vowel samples, only SLP judges had predictive measures (fo and ABI Jitter), and only for cisgender speakers. In number-counting samples, predictive measures for cisgender speakers were fomed and HNR for CG judges, and fomed for both SLP and TNB judges. For transgender and non-binary speakers, predictive measures were fomed for CG and SLP judges, and fomed, CPPs, and ABI for TNB judges. In the poem recital task, predictive measures for cisgender speakers were fomed and HNR for both SLP and CG judges, with additional measures of cvint and sr for CG judges, and fomed, HNR, cvint, and fopeakwidth for TNB judges. For transgender and non-binary speakers, the predictive measures covered a wider range of acoustic features: fomed, fosd, sr, fomin, emph, HNR, Shimmer, and fopeakwidth for SLP judges; fomed, fosd, sr, fomax, emph, HNR, and Shimmer for CG judges; and fomed, sr, emph, fosd, Shimmer, HNR, Jitter, and fomax for TNB judges.

CONCLUSIONS

There is an association between the perception of gender presentation in the voice and the gender or context of the listener and the speaker. Transgender and non-binary judges diverged to a higher degree from cisgender and SLP judges. Compared to the evaluation of cisgender speakers, all judge groups used a greater number of acoustic measurements when analyzing the speech of transgender and non-binary individuals in the poem recital samples.
