Ahmad Hawraz A, Rashid Tarik A
Department of Software and Informatics Engineering, Salahaddin University-Erbil, Erbil, KR, Iraq.
Department of Computer Science and Engineering, University of Kurdistan Hewler, Erbil, KR, Iraq.
Data Brief. 2024 Jul 14;55:110753. doi: 10.1016/j.dib.2024.110753. eCollection 2024 Aug.
Speech synthesis is now part of everyday computing around the world. The Central Kurdish speech corpus described here is intended as a primary data source for developing speech systems. Two main issues still prevent such systems from achieving their best possible performance: inefficient training and analysis, and the difficulty of modelling. The biggest obstacle to text-to-speech for Kurdish is the lack of text and speech recognition tools, compounded by the fact that around 30 million people speak Kurdish across different countries. To address this, the work introduces a large-vocabulary Kurdish Text-to-Speech dataset (KTTS, Gigant), comprising a pronunciation lexicon and a speech corpus for the Central Kurdish dialect. The sentences cover a variety of subjects and were recorded in a voice recording studio by a male Kurdish dubbing artist. The goal of the speech corpus is to provide a collection of sentences that accurately reflects real Central Kurdish usage. A combination of audio and visual sources was used to record the 6,078 sentences, which span 12 document topics. They were recorded in a controlled environment using low-noise microphones. The total recording duration is 13.63 h, and the recorded sentences are stored in .wav format.
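The reported figures imply an average of roughly 8 seconds of audio per sentence (13.63 h over 6,078 sentences). The following minimal sketch shows one way a user might check such statistics on a local copy of the recordings. It assumes the files are uncompressed PCM .wav files readable by Python's standard `wave` module; the function names and the directory layout are hypothetical and are not part of the published dataset.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the duration of one PCM .wav file in seconds."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def corpus_stats(root):
    """Count the .wav files under `root` (hypothetical layout) and sum their durations."""
    durations = [wav_duration_seconds(p) for p in sorted(Path(root).rglob("*.wav"))]
    total_s = sum(durations)
    return {
        "files": len(durations),
        "total_hours": total_s / 3600,
        "mean_seconds": total_s / len(durations) if durations else 0.0,
    }


# Figures reported in the abstract: 6,078 sentences, 13.63 h in total.
REPORTED_SENTENCES = 6078
REPORTED_HOURS = 13.63

# Average sentence length implied by the reported totals, in seconds.
mean_reported = REPORTED_HOURS * 3600 / REPORTED_SENTENCES
```

Calling `corpus_stats` on the folder that holds the recordings and comparing its `files` and `total_hours` values with the reported 6,078 and 13.63 gives a quick sanity check that a download is complete.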