Automated Speech Audiometry: Can It Work Using Open-Source Pre-Trained Kaldi-NL Automatic Speech Recognition?

Affiliations

Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands.

Publication information

Trends Hear. 2024 Jan-Dec;28:23312165241229057. doi: 10.1177/23312165241229057.

DOI:10.1177/23312165241229057

PMID:38483979

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10943752/

Abstract

The digits-in-noise (DIN) test is a practical speech audiometry tool for hearing screening of populations of varying ages and hearing status. The test is usually conducted either by a human supervisor (e.g., a clinician), who scores the responses spoken by the listener, or online, where software scores the responses entered by the listener. The test presents 24 digit triplets in an adaptive staircase procedure, resulting in a speech reception threshold (SRT). We propose an alternative automated DIN test setup that can evaluate spoken responses without a human supervisor, using the open-source automatic speech recognition toolkit Kaldi-NL. Thirty self-reported normal-hearing Dutch adults (19-64 years) completed one DIN + Kaldi-NL test. Their spoken responses were recorded and used to evaluate the transcripts of the responses decoded by Kaldi-NL. Study 1 evaluated Kaldi-NL performance through its word error rate (WER), defined here as the summed decoding errors, counting only digits found in the transcript, expressed as a percentage of the total number of digits present in the spoken responses. The average WER across participants was 5.0% (range 0-48%, SD = 8.8%), with decoding errors in three triplets per participant on average. Study 2 used bootstrapping simulations to analyze the effect that triplets with Kaldi-NL decoding errors had on the DIN test output (SRT). Previous research indicated 0.70 dB as the typical within-subject SRT variability for normal-hearing adults. Study 2 showed that up to four triplets with decoding errors produce SRT variations within this range, suggesting that our proposed setup could be feasible for clinical applications.
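The digit-only WER described in the abstract can be illustrated with a minimal sketch. It assumes that errors are counted with a standard Levenshtein alignment (substitutions, deletions, insertions) after discarding non-digit tokens; the abstract does not state the authors' exact scoring rules, so this is a stand-in rather than their method. The names `DIGIT_WORDS`, `keep_digits`, `edit_distance`, and `digit_wer`, and the example responses, are hypothetical.

```python
# Hypothetical digit vocabulary (Dutch digit words); the digit set used in the
# actual test is an assumption here.
DIGIT_WORDS = {"nul", "een", "twee", "drie", "vier",
               "vijf", "zes", "zeven", "acht", "negen"}

def keep_digits(tokens):
    """Drop every token that is not a digit word (e.g. fillers such as 'uh')."""
    return [t for t in tokens if t in DIGIT_WORDS]

def edit_distance(ref, hyp):
    """Levenshtein distance: minimum number of substitutions, deletions, insertions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def digit_wer(spoken, decoded):
    """Summed digit errors over all responses, as a percentage of spoken digits."""
    errors = 0
    total = 0
    for ref, hyp in zip(spoken, decoded):
        ref_digits = keep_digits(ref.lower().split())
        hyp_digits = keep_digits(hyp.lower().split())
        errors += edit_distance(ref_digits, hyp_digits)
        total += len(ref_digits)
    return 100.0 * errors / total if total else 0.0

# Made-up example: three spoken triplets and their decoded transcripts.
spoken = ["twee vijf acht", "uh drie zes een", "negen nul vier"]
decoded = ["twee vijf acht", "drie zes", "negen acht vier"]
print(round(digit_wer(spoken, decoded), 1))
```

In this made-up example, one digit is dropped from the second triplet and one is substituted in the third, giving two errors over nine spoken digits. The filler "uh" is ignored because only digit tokens are scored.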

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7725/10943752/e39831226994/10.1177_23312165241229057-fig1.jpg
