School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban, South Africa.
Comput Intell Neurosci. 2022 Sep 28;2022:3364141. doi: 10.1155/2022/3364141. eCollection 2022.
Classification of isolated digits is a basic challenge for many speech classification systems. Although much work has been carried out on spoken languages, only limited research on spoken English digit data has been reported in the literature. This paper proposes an intelligent system for the classification of spoken digit data, based on a deep feedforward neural network (DFNN) with hyperparameter optimization techniques, an ensemble method (random forest, RF), and a regression method (gradient boosting, GB). It investigates these machine learning (ML) algorithms to determine the best method for classifying spoken English digit data. The DFNN classifier outperformed the RF and GB classifiers on the public benchmark spoken English digit data, achieving 99.65% validation accuracy. The proposed model outperforms existing models that use only traditional classifiers.
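The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses `MLPClassifier` as a stand-in for the DFNN, uses a small grid search as one possible form of hyperparameter optimization, and replaces the spoken digit features with synthetic ten-class data. The function name `compare_classifiers`, the grid values, and the dataset sizes are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_classifiers(X, y, seed=0):
    """Return {model name: validation accuracy} for the three model families."""
    # Hold out 20% of the samples as a validation set (assumed split ratio).
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Feedforward network tuned over a small, purely illustrative grid.
    dfnn = GridSearchCV(
        make_pipeline(
            StandardScaler(),
            MLPClassifier(max_iter=500, random_state=seed),
        ),
        param_grid={
            "mlpclassifier__hidden_layer_sizes": [(64,), (64, 32)],
            "mlpclassifier__alpha": [1e-4, 1e-2],
        },
        cv=3,
    )
    models = {
        "DFNN": dfnn,
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "GB": GradientBoostingClassifier(n_estimators=50, random_state=seed),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # score() reports mean accuracy on the held-out validation set.
        scores[name] = model.score(X_val, y_val)
    return scores


# Synthetic stand-in for per-utterance feature vectors (NOT real speech data);
# ten classes play the role of the digits 0-9.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=12,
    n_classes=10,
    n_clusters_per_class=1,
    random_state=0,
)
scores = compare_classifiers(X, y)
for name, acc in scores.items():
    print(f"{name}: validation accuracy = {acc:.3f}")
```

On synthetic data the ranking of the three models says nothing about the reported result. The sketch only shows how the three classifier families can be evaluated on one shared validation split.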