Department of Electronics and Communication Engineering, National Institute of Technology, Durgapur, India.
Disabil Rehabil Assist Technol. 2024 Jan;19(1):233-246. doi: 10.1080/17483107.2022.2078898. Epub 2022 May 26.
This article presents the design and development of a generic assistive system that establishes an independent conversation platform for hearing-speech-impaired and visually impaired persons.
The software system is implemented in Python and HTML.
Considering the constraints associated with these impairments, the system implements both speech-to-text/gesture and text/gesture-to-speech conversion. Real-time hand-gesture-to-speech generation is implemented using static image tracking, a CNN-based deep learning technique, and the MediaPipe hand-tracking solution. The software prototype terminals communicate over the internet using the MQTT protocol, enabling conversation between visually impaired and hearing-speech-impaired persons.
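The terminal-to-terminal relay described above can be sketched as follows. The article states that the terminals exchange messages over the internet via MQTT; the minimal sketch below replaces the MQTT broker with an in-memory publish/subscribe stand-in so it is self-contained, and all class names, topic names, and helper functions are illustrative assumptions, not taken from the article:

```python
from collections import defaultdict

class InMemoryBroker:
    """Illustrative stand-in for an MQTT broker: routes payloads by topic."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        for callback in self.subscribers[topic]:
            callback(payload)

def gesture_to_text(gesture_label):
    # In the real system, a MediaPipe + CNN pipeline predicts the letter
    # from a hand image; here we assume the classifier already produced it.
    return gesture_label.upper()

def text_to_speech(text):
    # The real terminal synthesizes audio for the visually impaired user;
    # this sketch returns the utterance string that would be spoken.
    return f"speaking: {text}"

broker = InMemoryBroker()
spoken = []

# Terminal of the visually impaired user: renders incoming text as speech.
broker.subscribe("chat/to_blind_user", lambda msg: spoken.append(text_to_speech(msg)))

# Terminal of the hearing-speech-impaired user: sends recognized gestures.
broker.publish("chat/to_blind_user", gesture_to_text("h"))
broker.publish("chat/to_blind_user", gesture_to_text("i"))

print(spoken)  # → ['speaking: H', 'speaking: I']
```

In the deployed prototype, `InMemoryBroker` would be replaced by a real MQTT client (e.g., publish/subscribe calls against an internet-reachable broker), which is what lets two impaired users converse from different locations.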
The software system exhibits average prediction times of less than approximately 1 s for a four-letter audio word and 2 s for a single hand gesture, which are commensurate with the typical latency of human-to-human conversation. The CNN-based deep learning model achieves an average accuracy of 0.9996 and an average loss of 0.0008 on the hand gestures. The confusion matrix for alphabet-specific hand-gesture prediction confirms satisfactory gesture-recognition performance.
The software prototype of the generic assistive device shows its potential to establish exclusive communication between a visually impaired and a hearing-speech-impaired person over the internet. The same software interface can also support a conversation between two visually impaired persons or between two hearing-speech-impaired persons.

IMPLICATIONS FOR REHABILITATION
This article presents the design and development of a generic assistive interface that establishes an independent conversation platform for hearing-speech-impaired and visually impaired people over the internet. The same software interface can also support a conversation between two visually impaired persons or between two hearing-speech-impaired persons. The design can be further extended to multi-modal impairments, toward a universal assistive device for all-in-one communication.