Hey Siri: How Effective are Common Voice Recognition Systems at Recognizing Dysphonic Voices?

Author information

Rohlfing Matthew L, Buckley Daniel P, Piraquive Jacquelyn, Stepp Cara E, Tracy Lauren F

Affiliations

Department of Otolaryngology-Head and Neck Surgery, Boston Medical Center, Boston University School of Medicine, Boston, Massachusetts, U.S.A.

Department of Speech, Language, and Hearing Sciences, Boston University, Boston, Massachusetts, U.S.A.

Publication information

Laryngoscope. 2021 Jul;131(7):1599-1607. doi: 10.1002/lary.29082. Epub 2020 Sep 19.

Abstract

OBJECTIVES/HYPOTHESIS

Interaction with voice recognition systems, such as Siri™ and Alexa™, is an increasingly important part of everyday life. Patients with voice disorders may have difficulty with this technology, leading to frustration and reduction in quality of life. This study evaluates the ability of common voice recognition systems to transcribe dysphonic voices.

STUDY DESIGN

Retrospective evaluation of "Rainbow Passage" voice samples from patients with and without voice disorders.

METHODS

Participants with (n = 30) and without (n = 23) voice disorders were recorded reading the "Rainbow Passage". Recordings were played at standardized intensity and distance to dictation programs on Apple iPhone 6S™, Apple iPhone 11 Pro™, and Google Voice™. Word recognition scores were calculated as the proportion of correctly transcribed words. Word recognition scores were compared to auditory-perceptual and acoustic measures.
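The scoring step described above can be sketched in Python. This is only an illustration: the abstract does not say how transcripts were aligned to the passage, so the use of `difflib.SequenceMatcher`, the `tokenize` helper, the function name `word_recognition_score`, and the sample transcript are all assumptions rather than the authors' procedure.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def word_recognition_score(reference, transcript):
    """Proportion of reference words that appear, in order, in the transcript.

    Assumption: words are aligned with difflib's longest-matching-block
    heuristic; the study may have scored transcripts differently.
    """
    ref = tokenize(reference)
    hyp = tokenize(transcript)
    if not ref:
        raise ValueError("reference must contain at least one word")
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref)


# Illustrative sentence and a hypothetical dictation output (not study data).
reference = ("When the sunlight strikes raindrops in the air, "
             "they act as a prism and form a rainbow.")
transcript = ("when the sunlight strikes rain drops in the air "
              "they act as a prison and form a rainbow")

score = word_recognition_score(reference, transcript)
print(f"Word recognition score: {score:.3f}")
```

In this sketch a word split in two ("rain drops") or substituted ("prison") counts as a miss, so 15 of the 17 reference words are credited. A stricter or more lenient scoring rule would change the number but not the idea.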

RESULTS

Mean word recognition scores for participants with and without voice disorders were, respectively, 68.6% and 91.9% for Apple iPhone 6S™ (P < .001), 71.2% and 93.7% for Apple iPhone 11 Pro™ (P < .001), and 68.7% and 93.8% for Google Voice™ (P < .001). There were strong, approximately linear associations between CAPE-V ratings of overall severity of dysphonia and word recognition score, with correlation coefficients (R) of 0.609 (iPhone 6S™), 0.670 (iPhone 11 Pro™), and 0.619 (Google Voice™). These relationships persisted when controlling for diagnosis, age, gender, fundamental frequency, and speech rate (P < .001 for all systems).

CONCLUSION

Common voice recognition systems function well with nondysphonic voices but are poor at accurately transcribing dysphonic voices. There was a strong negative correlation between word recognition scores and perceptual voice evaluation. As our society increasingly interfaces with automated voice recognition technology, the needs of patients with voice disorders should be considered.

LEVEL OF EVIDENCE

4. Laryngoscope, 131:1599-1607, 2021.
