Rameau Anaïs
Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medical College, Sean Parker Institute for the Voice, New York, New York.
Head Neck. 2020 May;42(5):839-845. doi: 10.1002/hed.26057. Epub 2019 Dec 26.
The main modalities for voice restoration after laryngectomy are the electrolarynx, and the tracheoesophageal puncture [Correction added on 30 January 2020 after first online publication: The preceding sentence has been revised. It originally read "The main modalities for voice restoration after laryngectomy are the electrolarynx and the tracheoesophageal puncture."]. All have limitations, and new technologies may offer innovative alternatives via silent speech.
To describe a novel, personalized method of voice restoration that applies machine learning to electromyographic signals from articulatory muscles in order to recognize silent speech in a patient with total laryngectomy.
Surface electromyographic (sEMG) signals of articulatory muscles were recorded from the face and neck of a patient with total laryngectomy who was articulating words silently. These sEMG signals were then used for automatic speech recognition via machine learning. Sensor placement was tailored to the patient's unique anatomy, following radiation and surgery. A personalized wearable mask covering the sensors was designed using 3D scanning and 3D printing.
Using seven sEMG sensors on the patient's face and neck and two grounding electrodes, we recorded EMG data while he mouthed "Tedd" and "Ed." With data from 75 utterances of each word, an XGBoost machine-learning model discriminated between the sEMG signals of the two words with 86.4% accuracy.
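As an illustration only, the two-word discrimination task can be sketched in a few lines of Python. This is not the study's pipeline: the abstract does not describe the features or model settings used, so the sketch assumes per-channel root-mean-square (RMS) amplitude as the feature, uses synthetic noise in place of real recordings, and substitutes a simple nearest-centroid classifier for XGBoost so that it runs with the standard library alone. The per-channel gain values, the window length, and all function names are hypothetical.

```python
import math
import random

N_CHANNELS = 7    # seven sEMG sensors, as reported in the study
N_PER_WORD = 75   # 75 utterances per word, as reported in the study
N_SAMPLES = 200   # samples per utterance window (assumed)


def rms_features(utterance):
    """Root-mean-square amplitude of each channel in one utterance."""
    return [math.sqrt(sum(x * x for x in ch) / len(ch)) for ch in utterance]


def synth_utterance(rng, gains):
    """Hypothetical stand-in for a recording: zero-mean noise scaled per channel."""
    return [[rng.gauss(0.0, g) for _ in range(N_SAMPLES)] for g in gains]


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def predict(feat, centroids):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(feat, centroids[lab]))


def run_demo(seed=0):
    rng = random.Random(seed)
    # Invented per-channel activation levels for each word (illustration only).
    gains = {
        "Tedd": [1.0, 1.4, 0.8, 1.2, 1.0, 0.9, 1.3],
        "Ed": [1.0, 1.0, 0.8, 0.9, 1.0, 0.9, 1.0],
    }
    data = [
        (rms_features(synth_utterance(rng, g)), word)
        for word, g in gains.items()
        for _ in range(N_PER_WORD)
    ]
    rng.shuffle(data)
    split = int(0.8 * len(data))  # 80/20 train/test split (assumed)
    train, test = data[:split], data[split:]
    centroids = {w: centroid([f for f, lab in train if lab == w]) for w in gains}
    correct = sum(predict(f, centroids) == lab for f, lab in test)
    return correct / len(test), len(data)


if __name__ == "__main__":
    accuracy, n_utterances = run_demo()
    print(f"{n_utterances} utterances, held-out accuracy {accuracy:.3f}")
```

The accuracy printed by this sketch reflects only the synthetic data and says nothing about the 86.4% reported above. With real recordings, a gradient-boosted tree model such as XGBoost would take the place of the centroid step.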
This pilot study demonstrates the feasibility of sEMG-based alaryngeal speech recognition using tailored sensor placement and a personalized wearable device. Further refinement of this approach could allow translation of silently articulated speech into synthesized voiced speech via portable devices.