Deep Mixture of Diverse Experts for Large-Scale Visual Recognition.

Authors

Zhao Tianyi, Chen Qiuyu, Kuang Zhenzhong, Yu Jun, Zhang Wei, Fan Jianping

Publication information

IEEE Trans Pattern Anal Mach Intell. 2019 May;41(5):1072-1087. doi: 10.1109/TPAMI.2018.2828821. Epub 2018 Apr 20.

Abstract

This paper develops a deep mixture of diverse experts algorithm for efficiently learning a very large (mixture) network for large-scale visual recognition. First, a two-layer ontology is constructed to assign a large number of atomic object classes to a set of task groups according to the similarity of their learning complexities; a certain degree of inter-group task overlap is allowed to enable sufficient inter-group message passing. Second, a base deep CNN with M+1 outputs is learned for each task group to recognize its M atomic object classes and to identify one special "not-in-group" class. The network structure (number of layers and number of units in each layer) of a well-designed deep CNN (such as AlexNet, VGG, GoogLeNet, or ResNet) is used directly to configure each base deep CNN. To enhance the separability of the atomic object classes within a task group, two approaches are developed for learning more discriminative base deep CNNs: (a) a deep multi-task learning algorithm that exploits inter-class visual similarities; and (b) a two-layer network cascade that improves accuracy on the hard object classes to some degree while maintaining high accuracy on the easy ones. Finally, these complementary base deep CNNs, whose outputs are diverse but overlapping, are combined into a mixture network with a larger output layer that recognizes tens of thousands of atomic object classes. Experimental results show that the algorithm achieves very competitive results on large-scale visual recognition.
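The abstract does not state how the base networks' outputs are fused, so the following is only a minimal sketch of one plausible rule, not the authors' method. It assumes each expert applies a softmax over its M+1 outputs (so the "not-in-group" mass automatically discounts that expert's in-group scores) and that a class covered by several overlapping groups receives the average of their scores. The function names `softmax` and `combine_experts`, the group layout, and the logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_experts(expert_logits, expert_classes, num_classes):
    """Fuse per-group expert outputs into one score per atomic class.

    expert_logits[g]  : M_g + 1 logits; the LAST one is "not-in-group".
    expert_classes[g] : the M_g global class ids covered by group g.
    Assumed fusion rule (not from the paper): average the in-group
    probabilities over every expert that covers a class.
    """
    score_sum = [0.0] * num_classes
    votes = [0] * num_classes
    for logits, classes in zip(expert_logits, expert_classes):
        probs = softmax(logits)
        # probs[:-1] sums to 1 - P(not-in-group), so an expert that
        # rejects the image contributes little to its own classes.
        for cls, p in zip(classes, probs[:-1]):
            score_sum[cls] += p
            votes[cls] += 1
    return [s / v if v else 0.0 for s, v in zip(score_sum, votes)]


# Toy setup: 4 atomic classes, two task groups overlapping on class 2.
groups = [[0, 1, 2], [2, 3]]
logits = [
    [2.0, 0.5, 1.0, -1.0],  # expert 0: confident the image is in its group
    [0.0, 0.2, 3.0],        # expert 1: mostly votes "not-in-group"
]
scores = combine_experts(logits, groups, num_classes=4)
predicted = max(range(len(scores)), key=scores.__getitem__)
print(predicted, [round(s, 3) for s in scores])
```

Averaging over covering experts keeps overlapped classes from being favored simply because more experts vote on them; a learned gating layer would be another reasonable choice.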


