Deep Convolutional Neural Networks for large-scale speech tasks.

Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, United States.

Department of Computer Science, University of Toronto, Canada.

Publication information

Neural Netw. 2015 Apr;64:39-48. doi: 10.1016/j.neunet.2014.08.005. Epub 2014 Sep 16.

DOI:10.1016/j.neunet.2014.08.005
PMID:25439765
Abstract

Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations that exist in signals. Since speech signals exhibit both of these properties, we hypothesize that CNNs are a more effective model for speech than Deep Neural Networks (DNNs). In this paper, we explore applying CNNs to large vocabulary continuous speech recognition (LVCSR) tasks. First, we determine the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks. Specifically, we focus on how many convolutional layers are needed, what an appropriate number of hidden units is, and what the best pooling strategy is. Second, we investigate how to incorporate speaker-adapted features, which cannot be modeled directly by CNNs because they do not obey locality in frequency, into the CNN framework. Third, given the importance of sequence training for speech tasks, we introduce a strategy for using ReLU+dropout during Hessian-free sequence training of CNNs. Experiments on 3 LVCSR tasks indicate that a CNN with the proposed speaker-adapted and ReLU+dropout ideas allows for a 12%-14% relative improvement in word error rate (WER) over a strong DNN system, achieving state-of-the-art results on these 3 tasks.
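The claim that convolution plus pooling can reduce spectral variation can be illustrated with a toy example. The sketch below is not the paper's architecture: the 40-bin frame, the 3-tap filter, the pooling size of 3, and all function names are assumptions chosen for illustration. It convolves a single spectral frame along the frequency axis, applies ReLU and non-overlapping max pooling, and compares the result for a spectral peak and for the same peak shifted by one frequency bin.

```python
import numpy as np

def conv1d_freq(frame, kernel):
    """Valid cross-correlation of one spectral frame with a filter along frequency."""
    return np.correlate(frame, kernel, mode="valid")

def relu(x):
    """Rectified linear unit: clamp negative activations to zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size):
    """Non-overlapping max pooling; trailing bins that do not fill a window are dropped."""
    n = (len(x) // size) * size
    return x[:n].reshape(-1, size).max(axis=1)

# Hypothetical toy input: 40 frequency bins with one spectral peak at bin 11.
frame = np.zeros(40)
frame[11] = 1.0
# The same peak moved up by one bin, standing in for speaker-dependent spectral variation.
shifted = np.roll(frame, 1)

# Hypothetical smoothing filter (assumption, not a learned weight).
kernel = np.array([0.25, 0.5, 0.25])

conv_a = relu(conv1d_freq(frame, kernel))
conv_b = relu(conv1d_freq(shifted, kernel))
pool_a = max_pool(conv_a, size=3)
pool_b = max_pool(conv_b, size=3)

# Before pooling, the strongest response moves with the peak.
print("conv argmax:", np.argmax(conv_a), np.argmax(conv_b))
# After pooling, the strongest response lands in the same pooled bin.
print("pool argmax:", np.argmax(pool_a), np.argmax(pool_b))
# The distance between the two representations shrinks after pooling.
print("L2 before pooling:", np.linalg.norm(conv_a - conv_b))
print("L2 after pooling: ", np.linalg.norm(pool_a - pool_b))
```

The tolerance is only partial: in this toy setup a shift that crosses a pooling-window boundary still changes the pooled output, which is one reason the choice of pooling strategy matters.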

Similar articles

1. Deep Convolutional Neural Networks for large-scale speech tasks.
   Neural Netw. 2015 Apr;64:39-48. doi: 10.1016/j.neunet.2014.08.005. Epub 2014 Sep 16.
2. Brain tumor segmentation with Deep Neural Networks.
   Med Image Anal. 2017 Jan;35:18-31. doi: 10.1016/j.media.2016.05.004. Epub 2016 May 19.
3. Training Lightweight Deep Convolutional Neural Networks Using Bag-of-Features Pooling.
   IEEE Trans Neural Netw Learn Syst. 2019 Jun;30(6):1705-1715. doi: 10.1109/TNNLS.2018.2872995. Epub 2018 Oct 24.
4. Towards dropout training for convolutional neural networks.
   Neural Netw. 2015 Nov;71:1-10. doi: 10.1016/j.neunet.2015.07.007. Epub 2015 Jul 29.
5. Representations of regular and irregular shapes by deep Convolutional Neural Networks, monkey inferotemporal neurons and human judgments.
   PLoS Comput Biol. 2018 Oct 26;14(10):e1006557. doi: 10.1371/journal.pcbi.1006557. eCollection 2018 Oct.
6. Machine learning based sample extraction for automatic speech recognition using dialectal Assamese speech.
   Neural Netw. 2016 Jun;78:97-111. doi: 10.1016/j.neunet.2015.12.010. Epub 2015 Dec 30.
7. Fast learning method for convolutional neural networks using extreme learning machine and its application to lane detection.
   Neural Netw. 2017 Mar;87:109-121. doi: 10.1016/j.neunet.2016.12.002. Epub 2016 Dec 10.
8. Convolutional neural network architectures for predicting DNA-protein binding.
   Bioinformatics. 2016 Jun 15;32(12):i121-i127. doi: 10.1093/bioinformatics/btw255.
9. Deep Convolutional Neural Networks for breast cancer screening.
   Comput Methods Programs Biomed. 2018 Apr;157:19-30. doi: 10.1016/j.cmpb.2018.01.011. Epub 2018 Jan 11.
10. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks.
    Med Phys. 2017 Feb;44(2):547-557. doi: 10.1002/mp.12045.

Cited by

1. Opportunities and challenges with artificial intelligence in allergy and immunology: a bibliometric study.
   Front Med (Lausanne). 2025 Apr 9;12:1523902. doi: 10.3389/fmed.2025.1523902. eCollection 2025.
2. Wearable blood pressure sensors for cardiovascular monitoring and machine learning algorithms for blood pressure estimation.
   Nat Rev Cardiol. 2025 Feb 18. doi: 10.1038/s41569-025-01127-0.
3. CNN-Based Neurodegenerative Disease Classification Using QR-Represented Gait Data.
   Brain Behav. 2024 Oct;14(10):e70100. doi: 10.1002/brb3.70100.
4. Fractal Geometry Meets Computational Intelligence: Future Perspectives.
   Adv Neurobiol. 2024;36:983-997. doi: 10.1007/978-3-031-47606-8_48.
5. Deep learning reduces data requirements and allows real-time measurements in imaging FCS.
   Biophys J. 2024 Mar 19;123(6):655-666. doi: 10.1016/j.bpj.2023.11.3403. Epub 2023 Dec 4.
6. Toward memristive in-memory computing: principles and applications.
   Front Optoelectron. 2022 May 12;15(1):23. doi: 10.1007/s12200-022-00025-4.
7. Bandwidth Improvement in Ultrasound Image Reconstruction Using Deep Learning Techniques.
   Healthcare (Basel). 2022 Dec 30;11(1):123. doi: 10.3390/healthcare11010123.
8. A quantitative identification method based on CWT and CNN for external and inner broken wires of steel wire ropes.
   Heliyon. 2022 Nov 15;8(11):e11623. doi: 10.1016/j.heliyon.2022.e11623. eCollection 2022 Nov.
9. A novel hybrid model for six main pollutant concentrations forecasting based on improved LSTM neural networks.
   Sci Rep. 2022 Aug 24;12(1):14434. doi: 10.1038/s41598-022-17754-3.
10. Assessing the robustness of radiomics/deep learning approach in the identification of efficacy of anti-PD-1 treatment in advanced or metastatic non-small cell lung carcinoma patients.
    Front Oncol. 2022 Aug 5;12:952749. doi: 10.3389/fonc.2022.952749. eCollection 2022.