Department of Electrical & Computer Engineering, National University of Singapore, Singapore, Singapore.
National University of Singapore Suzhou Research Institute (NUSRI), Suzhou, China.
Nat Commun. 2021 Sep 10;12(1):5378. doi: 10.1038/s41467-021-25637-w.
Sign language recognition, especially sentence recognition, is of great significance for lowering the communication barrier between the hearing/speech impaired and non-signers. The general glove solutions, which are employed to detect the motions of our dexterous hands, can only recognize discrete single gestures (i.e., numbers, letters, or words) rather than sentences, falling far short of the needs of signers' daily communication. Here, we propose an artificial intelligence enabled sign language recognition and communication system comprising sensing gloves, a deep learning block, and a virtual reality interface. Non-segmentation and segmentation-assisted deep learning models achieve the recognition of 50 words and 20 sentences. Significantly, the segmentation approach splits entire sentence signals into word units; the deep learning model then recognizes all word elements and reversely reconstructs and recognizes the sentences. Furthermore, new/never-seen sentences created by recombining word elements in new orders can be recognized with an average correct rate of 86.67%. Finally, the sign language recognition results are projected into virtual space and translated into text and audio, allowing remote and bidirectional communication between signers and non-signers.
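The segmentation-assisted pipeline described above can be sketched in miniature: split a continuous glove signal into word-level units at quiet intervals, classify each unit, and rejoin the predicted words into a sentence. This is a hypothetical illustration only; the pause-based splitting rule (`threshold`, `min_gap`) and the `classify_word` callback are assumptions standing in for the paper's actual segmentation criterion and deep learning model.

```python
import numpy as np

def segment_words(signal, threshold=0.1, min_gap=5):
    """Split a 1-D glove signal into word-level segments.

    Samples whose absolute amplitude stays below `threshold` for at
    least `min_gap` consecutive steps are treated as pauses between
    words (a hypothetical criterion, not the paper's actual rule).
    """
    active = np.abs(signal) > threshold
    segments, start, gap = [], None, 0
    for i, is_active in enumerate(active):
        if is_active:
            if start is None:
                start = i          # a new word unit begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:     # pause long enough: close the word unit
                segments.append(signal[start:i - gap + 1])
                start, gap = None, 0
    if start is not None:          # signal ended mid-word
        segments.append(signal[start:])
    return segments

def recognize_sentence(signal, classify_word):
    """Classify each segmented word unit and rejoin the predictions.

    `classify_word` stands in for the deep learning word recognizer.
    """
    return " ".join(classify_word(seg) for seg in segment_words(signal))
```

Because recognition operates on word units rather than whole sentences, recombining known word elements in a new order yields a sentence the recognizer has effectively never seen but can still decode, which is what enables the reported 86.67% correct rate on new sentences.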