Goudarzi Alireza, Moya-Galé Gemma
Factorize, Tokyo, Japan.
Department of Communication Sciences and Disorders, Long Island University, Brooklyn, NY, United States.
Front Artif Intell. 2021 Dec 22;4:809321. doi: 10.3389/frai.2021.809321. eCollection 2021.
Artificial intelligence (AI) technologies have become considerably more sophisticated over the past decade. However, the unpredictability and variability of AI behavior on noisy signals remain underexplored and pose a challenge when generalizing AI behavior to real-life environments, especially for people with a speech disorder, who already experience reduced speech intelligibility. In the context of developing assistive technology for people with Parkinson's disease using automatic speech recognition (ASR), this pilot study reports on the performance of Google Cloud speech-to-text technology with dysarthric and healthy speech in the presence of multi-talker babble noise at different intensity levels. Despite their sensitivities and shortcomings, the performance of these systems can be controlled with current tools in order to measure speech intelligibility in real-life conditions.
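As an illustration only, the kind of evaluation described above could be sketched as follows: babble noise is scaled to a chosen level relative to the speech, and a transcript is scored against a reference. The abstract does not state how noise levels were set or which metric was used, so the signal-to-noise ratio (SNR) in decibels and the word error rate (WER) are assumptions here, and the function names `mix_at_snr` and `word_error_rate` are hypothetical. The call to the speech-to-text service is deliberately left out.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of audio samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Assumption: noise intensity is expressed as an SNR relative to speech RMS.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise signal is silent")
    # SNR_dB = 20 * log10(rms_speech / rms_noise), solved for the noise gain.
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)
```

In such a setup, each recording would be mixed with babble at several SNR values, the mixed audio sent to the recognizer, and the returned transcript compared with the reference text; a lower WER would then be read as higher machine-measured intelligibility.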