Voice Command Recognition Using Biologically Inspired Time-Frequency Representation and Convolutional Neural Networks.

Authors

Sharan Roneel V, Berkovsky Shlomo, Liu Sidong

Publication information

Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:998-1001. doi: 10.1109/EMBC44109.2020.9176006.

DOI: 10.1109/EMBC44109.2020.9176006
PMID: 33018153

Abstract

Voice commands are an important interface between humans and technology in healthcare, for example for hands-free control of surgical robots and in patient care technology. Voice command recognition can be cast as a speech classification task, where convolutional neural networks (CNNs) have demonstrated strong performance. CNNs were originally developed for image classification, and a time-frequency representation of the speech signal is the most commonly used image-like input for them. Various types of time-frequency representation are used for this purpose. This work investigates the cochleagram, which is computed with a gammatone filter that models the frequency selectivity of the human cochlea, as the time-frequency representation of voice commands and the input to the CNN classifier. We also explore a multi-view CNN as a technique for combining learning from different time-frequency representations. The proposed method is evaluated on a large dataset and shown to achieve high classification accuracy.
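
The abstract does not give the filterbank settings, so the following is only a minimal sketch of how a cochleagram is commonly computed, not the authors' implementation. It assumes a 4th-order gammatone impulse response with bandwidths taken from the Glasberg and Moore ERB formula, 32 channels spaced on the ERB-rate scale, and 25 ms frames with a 10 ms hop. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_space(fmin, fmax, n):
    """Return n centre frequencies equally spaced on the ERB-rate scale."""
    to_rate = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
    from_rate = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
    return from_rate(np.linspace(to_rate(fmin), to_rate(fmax), n))

def gammatone_ir(fc, fs, duration=0.05, order=4):
    """Truncated gammatone impulse response, normalised to unit energy."""
    t = np.arange(int(round(duration * fs))) / fs
    b = 1.019 * erb(fc)  # bandwidth parameter of the gammatone envelope
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))

def cochleagram(x, fs, n_channels=32, fmin=100.0, fmax=None,
                frame_len=0.025, hop=0.010):
    """Log frame energies of gammatone-filtered x, shape (channels, frames)."""
    fmax = fmax or 0.45 * fs  # assumed upper edge, kept below Nyquist
    cfs = erb_space(fmin, fmax, n_channels)
    win = int(round(frame_len * fs))
    step = int(round(hop * fs))
    n_frames = 1 + (len(x) - win) // step
    out = np.empty((n_channels, n_frames))
    for i, fc in enumerate(cfs):
        # Filter one channel, then sum the squared output within each frame.
        y = np.convolve(x, gammatone_ir(fc, fs))[:len(x)]
        for j in range(n_frames):
            seg = y[j * step:j * step + win]
            out[i, j] = np.sum(seg ** 2)
    return np.log(out + 1e-10), cfs  # small offset avoids log(0)

# Example: a 0.5 s, 1 kHz test tone sampled at 16 kHz.
fs = 16000
tone = np.sin(2 * np.pi * 1000.0 * np.arange(fs // 2) / fs)
cg, centre_freqs = cochleagram(tone, fs)
print(cg.shape)
```

The resulting array can be treated as a single-channel image for a CNN. For a pure tone, the energy concentrates in the channels whose centre frequencies lie closest to the tone frequency.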

Similar articles

1
Voice Command Recognition Using Biologically Inspired Time-Frequency Representation and Convolutional Neural Networks.
Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:998-1001. doi: 10.1109/EMBC44109.2020.9176006.
2
Age and Gender Recognition Using a Convolutional Neural Network with a Specially Designed Multi-Attention Module through Speech Spectrograms.
Sensors (Basel). 2021 Sep 1;21(17):5892. doi: 10.3390/s21175892.
3
Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks.
Sensors (Basel). 2021 May 14;21(10):3434. doi: 10.3390/s21103434.
4
Orthogonal convolutional neural networks for automatic sleep stage classification based on single-channel EEG.
Comput Methods Programs Biomed. 2020 Jan;183:105089. doi: 10.1016/j.cmpb.2019.105089. Epub 2019 Sep 27.
5
Voice pathology detection using optimized convolutional neural networks and explainable artificial intelligence-based analysis.
Comput Methods Biomech Biomed Engin. 2024 Nov;27(14):2041-2057. doi: 10.1080/10255842.2023.2270102. Epub 2023 Oct 18.
6
Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions.
Sensors (Basel). 2021 Jun 27;21(13):4399. doi: 10.3390/s21134399.
7
Convolutional Neural Networks for Pathological Voice Detection.
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:1-4. doi: 10.1109/EMBC.2018.8513222.
8
Voice pathology detection and classification from speech signals and EGG signals based on a multimodal fusion method.
Biomed Tech (Berl). 2021 Nov 29;66(6):613-625. doi: 10.1515/bmt-2021-0112. Print 2021 Dec 20.
9
DNN Filter Bank Improves 1-Max Pooling CNN for Single-Channel EEG Automatic Sleep Stage Classification.
Annu Int Conf IEEE Eng Med Biol Soc. 2018 Jul;2018:453-456. doi: 10.1109/EMBC.2018.8512286.
10
Configuration-Invariant Sound Localization Technique Using Azimuth-Frequency Representation and Convolutional Neural Networks.
Sensors (Basel). 2020 Jul 5;20(13):3768. doi: 10.3390/s20133768.

Cited by

1
Spike encoding techniques for IoT time-varying signals benchmarked on a neuromorphic classification task.
Front Neurosci. 2022 Dec 21;16:999029. doi: 10.3389/fnins.2022.999029. eCollection 2022.
2
Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks.
Sensors (Basel). 2021 May 14;21(10):3434. doi: 10.3390/s21103434.