

Speech synthesis using generative adversarial network for improving readability of Hindi words to recuperate from dyslexia.

Author information

Atkar Geeta, Jayaraju Priyadarshini

Affiliation

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, 600127 India.

Publication information

Neural Comput Appl. 2021;33(15):9353-9362. doi: 10.1007/s00521-021-05695-3. Epub 2021 Feb 15.

Abstract

Children learn and develop their abilities at their own pace, and reading is one of the most basic skills they acquire. Some children, however, struggle with reading for longer than their peers, which may indicate a learning disorder known as dyslexia. This paper uses generative neural networks to generate raw audio for two- or three-letter Hindi words. From the generated data, a system is built that pronounces the words for children recuperating from dyslexia. The system is intended as a tool that helps teachers speed up recuperation by having the child repeat the correct pronunciation of each word. It uses an advanced Mel-generative adversarial network that works with Mel-spectrograms of the raw audio and refines its own generated audio iteratively until a satisfactory result is achieved. The generated audio samples contain the Hindi words that will be taught to the children. The Mel-generative adversarial network is chosen for generating the audio samples because it provides better results than other existing models. A set of 300 basic two- or three-letter Hindi words is taken as input for assisting children aged 5 to 8 years. The mean opinion score is calculated for comparison.
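
The opinion-score comparison mentioned at the end of the abstract is conventionally reported as a mean opinion score (MOS): listeners rate each audio sample on a 1 to 5 scale and the ratings are averaged. The sketch below shows one way such a comparison could be computed. It is an illustration only, not the authors' evaluation code: the function name, the system labels, and the ratings are hypothetical, and the 95% interval uses a simple normal approximation.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie in [1, 5]")
    mos = mean(ratings)
    # Normal-approximation interval; reported as 0.0 when only one rating exists.
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return mos, half_width


# Hypothetical ratings for two speech-synthesis systems (not data from the paper).
ratings_by_system = {
    "system_a": [4, 5, 4, 3, 4],
    "system_b": [3, 3, 4, 2, 3],
}

for name, ratings in ratings_by_system.items():
    mos, ci = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

With so few ratings the intervals are wide, so a real comparison would need many more listeners and samples per system than this toy input.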


