MagTrack: A Wearable Tongue Motion Tracking System for Silent Speech Interfaces.

Affiliations

Department of Electrical and Computer Engineering, The University of Texas at Austin.

Department of Speech, Language, and Hearing Sciences, The University of Texas at Austin.

Publication information

J Speech Lang Hear Res. 2023 Aug 17;66(8S):3206-3221. doi: 10.1044/2023_JSLHR-22-00319. Epub 2023 May 5.

Abstract

PURPOSE

Current electromagnetic tongue tracking devices are not amenable to daily use and are thus unsuitable for silent speech interfaces and other applications. We have recently developed MagTrack, a novel wearable electromagnetic articulograph tongue tracking device. This study aimed to validate MagTrack for potential silent speech interface applications.

METHOD

We conducted two experiments: (a) classification of eight isolated vowels in consonant-vowel-consonant form and (b) continuous silent speech recognition. In these experiments, we used data from healthy adult speakers collected with MagTrack. Vowel classification performance was measured by accuracy, and continuous silent speech recognition performance was measured by phoneme error rate. The performance was then compared with results obtained in a prior study using data collected with a commercial electromagnetic articulograph.
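The abstract does not spell out how the phoneme error rate was scored. A common definition, assumed here, is the minimum number of substitutions, deletions, and insertions needed to turn the recognized phoneme sequence into the reference sequence, divided by the reference length. The sketch below illustrates that definition only; the function name `phoneme_error_rate` and the example phoneme labels are hypothetical and are not taken from the study.

```python
def phoneme_error_rate(reference, hypothesis):
    """Edit distance between two phoneme sequences, divided by the reference length.

    Assumed definition: (substitutions + deletions + insertions) / len(reference).
    """
    if not reference:
        raise ValueError("reference must contain at least one phoneme")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j phonemes of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ph in enumerate(reference, start=1):
        curr = [i]  # deleting all i reference phonemes to reach an empty hypothesis
        for j, hyp_ph in enumerate(hypothesis, start=1):
            cost = 0 if ref_ph == hyp_ph else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference phoneme missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis phoneme)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


# Illustrative labels only: one substituted vowel in a three-phoneme word.
per = phoneme_error_rate(["k", "ae", "t"], ["k", "ah", "t"])
print(f"PER = {per:.2%}")
```

Because insertions count as errors, a rate computed this way can exceed 100% when the recognizer outputs many more phonemes than the reference contains. Vowel classification accuracy, by contrast, is simply the fraction of test tokens assigned the correct vowel label.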

RESULTS

The isolated vowel classification using MagTrack achieved an average accuracy of 89.74% when leveraging all MagTrack signals (x, y, z coordinates; orientation; and magnetic signals), which outperformed the accuracy obtained with commercial electromagnetic articulograph data (only two spatial coordinates) in our previous study. The continuous speech recognition from two subjects using MagTrack achieved phoneme error rates of 73.92% and 66.73%, respectively. For the second of these subjects, the commercial electromagnetic articulograph achieved a phoneme error rate of 64.53% (vs. 66.73% using MagTrack data).

CONCLUSIONS

MagTrack showed comparable results with the commercial electromagnetic articulograph when using the same localized information. Adding raw magnetic signals would improve the performance of MagTrack. Our preliminary testing demonstrated MagTrack's potential as a lightweight wearable device for silent speech interfaces. This work also lays the foundation to support MagTrack's potential for other applications, including visual feedback-based speech therapy and second language learning.
