MIFT Department, University of Messina, Italy; Campus Bio-Medico University of Rome, Italy.
Speech Language Pathologist, Messina, Italy.
Comput Biol Med. 2022 Sep;148:105864. doi: 10.1016/j.compbiomed.2022.105864. Epub 2022 Jul 12.
Many application scenarios now benefit from automatic speech recognition (ASR) technology. In speech therapy, ASR is sometimes used in the treatment of dysarthria to support articulation output. However, in the presence of atypical speech, standard ASR approaches do not provide reliable recognition results because of issues that include: (i) the extreme intra- and inter-speaker variability of speech in the presence of speech impairments such as dysarthria; and (ii) the absence of dedicated corpora containing voice samples from users with a speech disability that could be used to train a state-of-the-art speech model, particularly for languages other than English. In this paper, we focus on isolated word recognition for native Italian speakers with dysarthria, and we exploit an existing mobile app to collect audio data from users with speech disorders while they perform articulation exercises for speech therapy purposes. Using the collected data, a convolutional neural network was trained to spot a small number of keywords within atypical speech, following a speaker-dependent approach. Finally, we discuss the benefits of the trained ASR system in tailored telerehabilitation contexts, intended for patients with dysarthria who follow treatment plans under the remote supervision of speech-language pathologists.
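As a rough illustration of what spotting a small set of keywords with a convolutional network involves, the sketch below runs a single forward pass: convolution over a time-by-feature input, ReLU with global max pooling, and a softmax over the keyword vocabulary. It is not the architecture used in the paper. The layer layout, the hand-set weights, the function names, and the keywords "casa" and "sole" are all assumptions made for illustration; in a speaker-dependent setting, one such set of parameters would be trained per speaker.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # [time][feature], e.g. a small spectrogram patch


def conv2d_valid(x: Matrix, kernel: Matrix) -> Matrix:
    """2-D 'valid' cross-correlation of the input with a single kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu_global_max(fmap: Matrix) -> float:
    """Apply ReLU, then keep the strongest activation of the feature map."""
    return max(max(0.0, v) for row in fmap for v in row)


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spot_keyword(
    x: Matrix,
    kernels: Sequence[Matrix],
    weights: Matrix,
    biases: Sequence[float],
    keywords: Sequence[str],
) -> Tuple[str, List[float]]:
    """Return the most probable keyword and the full probability vector.

    All parameters are hypothetical: one kernel per feature map, and a
    dense layer with one row of weights per keyword.
    """
    pooled = [relu_global_max(conv2d_valid(x, k)) for k in kernels]
    logits = [
        sum(w * p for w, p in zip(row, pooled)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(keywords)), key=probs.__getitem__)
    return keywords[best], probs


# Toy input and hand-set (untrained) parameters, for illustration only.
x = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
word, probs = spot_keyword(x, kernels, weights, biases, ["casa", "sole"])
```

A trained system would learn the kernels and dense weights from each speaker's recorded exercises instead of fixing them by hand, and would typically operate on spectral features extracted from the audio.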