Department of Electronics and Information Engineering, Xi'an Technological University, Xi'an City, 710032, China.
Neural Netw. 2020 May;125:41-55. doi: 10.1016/j.neunet.2020.01.030. Epub 2020 Feb 6.
Chinese sign language (CSL) is one of the most widely used sign language systems in the world, so the automatic recognition and generation of CSL is a key technology for bidirectional communication between deaf and hearing people. Most previous studies have focused solely on sign language recognition (SLR), which addresses communication in only one direction. Sign language generation (SLG) is needed to enable communication in the other direction (i.e., from hearing people to deaf people). To achieve a smoother exchange of ideas between these two groups, we propose a skeleton-based CSL recognition and generation framework built on a recurrent neural network (RNN) to support bidirectional CSL communication. This process can also be extended to other sequence-to-sequence information interactions. The core of the proposed framework is a two-level probability generative model. Compared with previous techniques, this approach offers a more flexible approximate posterior distribution, which can produce skeletal sequences of varying styles that are recognizable to humans. In addition, the proposed generation method compensates for a lack of training data. A series of bidirectional communication experiments was conducted on the large 500 CSL dataset. The proposed algorithm achieved high recognition accuracy for both real and synthetic data, with a reduced runtime. Furthermore, the generated data improved the performance of the discriminator. These results suggest that the proposed bidirectional communication framework and generation algorithm are an effective new approach to CSL recognition.