Survey on Deep Neural Networks in Speech and Vision Systems.

Authors

Alam M, Samad M D, Vidyaratne L, Glandon A, Iftekharuddin K M

Affiliation

Department of Computer Science, Tennessee State University, Nashville, TN, 37209.

Publication

Neurocomputing (Amst). 2020 Dec 5;417:302-321. doi: 10.1016/j.neucom.2020.07.053. Epub 2020 Jul 26.

DOI: 10.1016/j.neucom.2020.07.053
PMID: 33100581
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7584105/

Abstract

This survey presents a review of state-of-the-art deep neural network architectures, algorithms, and systems in vision and speech applications. Recent advances in deep artificial neural network algorithms and architectures have spurred rapid innovation and development of intelligent vision and speech systems. With availability of vast amounts of sensor data and cloud computing for processing and training of deep neural networks, and with increased sophistication in mobile and embedded technology, the next-generation intelligent systems are poised to revolutionize personal and commercial computing. This survey begins by providing background and evolution of some of the most successful deep learning models for intelligent vision and speech systems to date. An overview of large-scale industrial research and development efforts is provided to emphasize future trends and prospects of intelligent vision and speech systems. Robust and efficient intelligent systems demand low-latency and high fidelity in resource-constrained hardware platforms such as mobile devices, robots, and automobiles. Therefore, this survey also provides a summary of key challenges and recent successes in running deep neural networks on hardware-restricted platforms, i.e. within limited memory, battery life, and processing capabilities. Finally, emerging applications of vision and speech across disciplines such as affective computing, intelligent transportation, and precision medicine are discussed. To our knowledge, this paper provides one of the most comprehensive surveys on the latest developments in intelligent vision and speech applications from the perspectives of both software and hardware systems. Many of these emerging technologies using deep neural networks show tremendous promise to revolutionize research and development for future vision and speech systems.
