Sebkhi Nordine, Desai Dhyey, Islam Mohammad, Lu Jun, Wilson Kimberly, Ghovanloo Maysam
IEEE Trans Biomed Eng. 2017 Nov;64(11):2639-2649. doi: 10.1109/TBME.2017.2654361. Epub 2017 Jan 18.
Speech-language pathologists (SLPs) are trained to correct the articulation of people diagnosed with motor speech disorders by analyzing articulator motion and assessing speech outcomes while patients speak. To assist SLPs in this task, we present the multimodal speech capture system (MSCS), which records and displays the kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. The collected speech modalities (tongue motion, lip gestures, and voice) are visualized not only in real time, to provide patients with instant feedback, but also offline, to allow SLPs to perform post-hoc analysis of articulator motion, particularly that of the tongue, whose role in articulation is prominent but hardly visible. We describe the MSCS hardware and software components and demonstrate its basic visualization capabilities with a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern-matching algorithms applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods, which are mostly subjective and may vary from one SLP to another.