Validation of an AI-assisted Treatment Outcome Measure for Gender-Affirming Voice Care: Comparing AI Accuracy to Listener's Perception of Voice Femininity.

Author information

Simon Shane, Silverstein Einav, Timmons-Sund Lauren, Pinto Jeremy M, Castro Eugenia M, O'Dell Karla, Johns III Michael M, Mack Wendy J, Bensoussan Yael

Affiliations

Department of Otolaryngology-Head & Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California.

Caruso Department of Otolaryngology, Head and Neck Surgery, Keck School of Medicine, University of Southern California, Los Angeles, California.

Publication information

J Voice. 2023 Dec 29. doi: 10.1016/j.jvoice.2023.12.008.

DOI: 10.1016/j.jvoice.2023.12.008
PMID: 38158296

Abstract

OBJECTIVES

There is currently a lack of objective treatment outcome measures for transgender individuals undergoing gender-affirming voice care. Recently, Bensoussan et al. developed an AI model that generates a voice femininity rating from a short voice sample provided through a smartphone application. The purpose of this study was to examine the feasibility of using this model as a treatment outcome measure by comparing its performance to that of human listeners. Additionally, we examined the effect of two different training datasets on the model's accuracy and performance when presented with external data.

METHODS

One hundred voice recordings from 50 cisgender males and 50 cisgender females were retrospectively collected from patients presenting at a university voice clinic for reasons other than dysphonia. The recordings were evaluated by expert and naïve human listeners, who rated each voice based on how sure they were that the voice belonged to a female speaker (% voice femininity [R]). Human ratings were compared to ratings generated by (1) the AI model trained on a high-quality, low-quantity dataset (voices from the Perceptual Voice Quality Database) (PVQD model), and (2) the AI model trained on a low-quality, high-quantity dataset (voices from the Mozilla Common Voice database) (Mozilla model). Ambiguity scores were calculated as the absolute value of the difference between the rating and certainty (0 or 100%).
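
The ambiguity score described above can be sketched in a few lines of Python. This is only an illustration under one assumption: "certainty" is taken to be whichever extreme (0% or 100%) lies closer to the rating. The abstract does not say whether the reference point is the nearest extreme or the speaker's true category, so the function name `ambiguity_score` and this interpretation are hypothetical, and the sample ratings are invented for illustration rather than taken from the study.

```python
def ambiguity_score(rating_pct: float) -> float:
    """Distance, in percentage points, from the nearest certain rating.

    Assumption: "certainty" means the closer of 0% and 100%.
    The abstract does not spell out which reference point is used.
    """
    if not 0.0 <= rating_pct <= 100.0:
        raise ValueError("rating must be between 0 and 100")
    return min(abs(rating_pct - 0.0), abs(100.0 - rating_pct))


# Illustrative ratings only; these are not data from the study.
example_ratings = [5.0, 50.0, 92.0]
example_scores = [ambiguity_score(r) for r in example_ratings]
```

Under this reading, a rating of 50% is maximally ambiguous (score 50), while ratings near 0% or 100% score close to zero.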

RESULTS

Both expert and naïve listeners achieved 100% accuracy in identifying voice gender based on a binary classification (female >50% voice femininity [R]). In comparison, the Mozilla-trained model achieved 92% accuracy and the previously published PVQD model achieved 84% accuracy in determining voice gender (female >50% AI voice femininity). While both AI models correlated with human ratings, the Mozilla-trained model showed a stronger correlation as well as lower overall rating ambiguity than the PVQD-trained model. The Mozilla model also appeared to handle pitch information in a similar way to human raters.
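
The binary classification rule quoted above (female if the femininity rating exceeds 50%) and the resulting accuracy can be sketched as follows. The helper names `predicts_female` and `accuracy` are hypothetical, and the toy ratings and labels are invented for illustration; they are not the study's data. How the authors handled a rating of exactly 50% is not stated, so the strict `>` comparison simply mirrors the wording of the abstract.

```python
from typing import Sequence


def predicts_female(femininity_pct: float, threshold: float = 50.0) -> bool:
    """Apply the 'female > 50% voice femininity' rule from the abstract."""
    return femininity_pct > threshold


def accuracy(ratings: Sequence[float],
             is_female: Sequence[bool],
             threshold: float = 50.0) -> float:
    """Fraction of recordings whose thresholded rating matches the label."""
    if len(ratings) != len(is_female) or len(ratings) == 0:
        raise ValueError("ratings and labels must be non-empty and equal in length")
    correct = sum(
        predicts_female(r, threshold) == label
        for r, label in zip(ratings, is_female)
    )
    return correct / len(ratings)


# Toy inputs for illustration only; not data from the study.
toy_ratings = [12.0, 48.0, 50.0, 77.0]
toy_labels = [False, False, True, True]
toy_accuracy = accuracy(toy_ratings, toy_labels)
```

With 100 recordings, the reported accuracies of 92% and 84% would correspond to 92 and 84 correctly classified voices under this rule.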

CONCLUSIONS

The AI model predicted voice gender with high accuracy when compared to human listeners and has potential as a useful outcome measure for transgender individuals receiving gender-affirming voice training. The Mozilla-trained model performed better than the PVQD-trained model, indicating that for binary classification tasks, the quantity of data may influence accuracy more than the quality of the data used for training the voice AI models.

Similar articles

1
Validation of an AI-assisted Treatment Outcome Measure for Gender-Affirming Voice Care: Comparing AI Accuracy to Listener's Perception of Voice Femininity.
J Voice. 2023 Dec 29. doi: 10.1016/j.jvoice.2023.12.008.
2
Perception of Femininity and Masculinity in Voices as Rated by Transgender and Gender Diverse People, Professional Speech and Language Pathologists, and Cisgender Naive Listeners.
J Voice. 2024 Aug 22. doi: 10.1016/j.jvoice.2024.07.034.
3
Deep Learning for Voice Gender Identification: Proof-of-concept for Gender-Affirming Voice Care.
Laryngoscope. 2021 May;131(5):E1611-E1615. doi: 10.1002/lary.29281. Epub 2020 Nov 21.
4
Perceived Gender and Client Satisfaction in Transgender Voice Work: Comparing Self and Listener Rating Scales across a Training Program.
Folia Phoniatr Logop. 2022;74(5):364-379. doi: 10.1159/000521226. Epub 2021 Nov 30.
5
Examining the voice of Israeli transgender women: Acoustic measures, voice femininity and voice-related quality-of-life.
Int J Transgend Health. 2020 Aug 7;22(3):281-293. doi: 10.1080/26895269.2020.1798838. eCollection 2021.
6
Acoustic Features of Transfeminine Voices and Perceptions of Voice Femininity.
J Voice. 2020 Nov;34(6):961.e19-961.e26. doi: 10.1016/j.jvoice.2019.05.012. Epub 2019 Jun 13.
7
Acoustic Predictors of Gender Attribution, Masculinity-Femininity, and Vocal Naturalness Ratings Amongst Transgender and Cisgender Speakers.
J Voice. 2020 Mar;34(2):300.e11-300.e26. doi: 10.1016/j.jvoice.2018.10.002. Epub 2018 Nov 28.
8
A Comparison of an Artificial Intelligence Tool to Fundamental Frequency as an Outcome Measure in People Seeking a More Feminine Voice.
Laryngoscope. 2021 Nov;131(11):2567-2571. doi: 10.1002/lary.29605. Epub 2021 May 11.
9
Associations Between Voice and Gestural Characteristics of Transgender Women and Self-Rated Femininity, Satisfaction, and Quality of Life.
Am J Speech Lang Pathol. 2021 Mar 26;30(2):663-672. doi: 10.1044/2020_AJSLP-20-00118. Epub 2021 Mar 1.
10
Gender-Affirming Voice Training for Trans Women: Acoustic Outcomes and Their Associations With Listener Perceptions Related to Gender.
J Voice. 2024 Mar 18. doi: 10.1016/j.jvoice.2024.02.003.

Cited by

1
Laryngeal dystonia and vocal tremor response to botulinum toxin injection.
Eur Arch Otorhinolaryngol. 2025 Feb;282(2):919-926. doi: 10.1007/s00405-024-09111-z. Epub 2024 Dec 7.