Speech synthesis using generative adversarial network for improving readability of Hindi words to recuperate from dyslexia.

Author information

Atkar Geeta, Jayaraju Priyadarshini

Affiliation

School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, 600127 India.

Publication information

Neural Comput Appl. 2021;33(15):9353-9362. doi: 10.1007/s00521-021-05695-3. Epub 2021 Feb 15.

DOI:10.1007/s00521-021-05695-3

PMID:33612979

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7883547/

Abstract

Children learn and develop their abilities at their own pace. One of the most basic skills they acquire is reading. However, some children struggle with reading for longer than their peers, and in such cases they may have a learning disorder known as dyslexia. This paper aims to use neural networks, namely generative neural networks, to generate raw audio data for two- or three-letter Hindi words. Using the generated data, a system will be built that pronounces the generated words for children recuperating from dyslexia. The system aims to be an effective aid for teachers, speeding up recuperation by having the child repeat the correct pronunciation of each word. It uses an advanced Mel-generative adversarial network (MelGAN) that works with Mel-spectrograms of the raw audio and models its own audio iteratively until a satisfactory result is achieved. The generated audio samples contain the Hindi words that will be taught to the children. MelGAN is used to generate the audio samples because it provides better results than other existing models. A set of 300 basic two- or three-letter Hindi words is taken as input for assisting children aged 5 to 8 years. A mean opinion score (MOS) is calculated for comparison.
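The abstract refers to Mel-spectrograms without describing the frequency scale behind them. As a rough illustration only, the sketch below converts between hertz and mels using the widely used HTK-style formula and lists band edges spaced evenly on the mel scale. The function names, the 80-band count, and the 11025 Hz upper limit are assumptions chosen for the example; the abstract does not state the settings the authors used.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels: int, f_min: float, f_max: float) -> list:
    """Return n_mels + 2 frequencies (Hz) equally spaced on the mel scale.

    Neighbouring triples of these edges define the triangular filters
    that turn a linear-frequency spectrum into a Mel-spectrogram.
    """
    mel_lo, mel_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_hi - mel_lo) / (n_mels + 1)
    return [mel_to_hz(mel_lo + i * step) for i in range(n_mels + 2)]


if __name__ == "__main__":
    # Hypothetical settings: 80 bands up to 11025 Hz (not taken from the paper).
    edges = mel_band_edges(n_mels=80, f_min=0.0, f_max=11025.0)
    print(len(edges), [round(e, 1) for e in edges[:4]])
```

Because the edges are evenly spaced in mels, they are packed closely at low frequencies and spread out at high frequencies, which roughly mirrors how listeners perceive pitch.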

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f600/7883547/c620d8ce704e/521_2021_5695_Fig1_HTML.jpg
