Enhancing human computer interaction with coot optimization and deep learning for multi language identification.

Affiliations

Candidate of Economic Sciences, Department of Economics and Management, Kazan Federal University, Elabuga Institute of KFU, Elabuga, 423604, Russia.

Moscow Aviation Institute (National Research University), Moscow, 125080, Russia.

Publication information

Sci Rep. 2024 Oct 3;14(1):22963. doi: 10.1038/s41598-024-74327-2.

DOI:10.1038/s41598-024-74327-2

PMID:39362948

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11450161/

Abstract

Human-Computer Interaction (HCI) is a multidisciplinary field concerned with the design and use of computer technology, with an emphasis on the interface between humans and computers. HCI aims to produce systems that let users interact with computers effectively, efficiently, and pleasantly. Multiple Spoken Language Identification (SLI) for HCI (MSLI for HCI) denotes the ability of a computer system to recognize and distinguish between spoken languages, enabling more complete and convenient interaction between users and technology. SLI based on deep learning (DL) uses artificial neural networks (ANNs) to automatically detect and recognize the language spoken in an audio signal. DL techniques, particularly neural networks (NNs), have succeeded in various pattern recognition tasks, including speech and language processing. This paper develops a novel Coot Optimizer Algorithm with a DL-Driven Multiple SLI and Detection (COADL-MSLID) technique for HCI applications. The COADL-MSLID approach aims to detect multiple spoken languages from input audio regardless of the speaker's gender, speaking style, and age. As a first step, the audio files are transformed into spectrogram images. The technique then employs the SqueezeNet model to produce feature vectors, with the COA applied to tune the SqueezeNet hyperparameters. A convolutional autoencoder (CAE) model is used for the SLID process. To demonstrate the performance of the COADL-MSLID technique, a series of experiments was conducted on a benchmark dataset. The experimental validation shows that the COADL-MSLID technique achieves an accuracy of 98.33%, higher than the other techniques compared.
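The first step named in the abstract, turning audio into spectrogram images, can be illustrated with a minimal sketch. The abstract does not give the framing parameters, window type, or image scaling, so the frame length, hop size, Hann window, and 8-bit scaling below are assumptions chosen for illustration, and the function names `log_spectrogram` and `to_image` are hypothetical rather than taken from the paper.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len, hop, and the Hann window are illustrative assumptions,
    not values reported in the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        # Pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Magnitude of the one-sided FFT of every frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels and put frequency on the first axis.
    return 20.0 * np.log10(magnitude + eps).T


def to_image(spec):
    """Scale a spectrogram to an 8-bit grayscale image array (assumed format)."""
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo) if hi > lo else np.zeros_like(spec)
    return (255 * scaled).astype(np.uint8)


# Usage: a synthetic 1 kHz tone stands in for a real speech recording.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = log_spectrogram(tone)
img = to_image(spec)
```

An array like `img` is the kind of input a convolutional feature extractor such as SqueezeNet would consume, typically after resizing and channel replication; those steps are omitted here because the abstract does not describe them.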
