Fan Jianping, Zhao Tianyi, Kuang Zhenzhong, Zheng Yu, Zhang Ji, Yu Jun, Peng Jinye
IEEE Trans Image Process. 2017 Apr;26(4):1923-1938. doi: 10.1109/TIP.2017.2667405. Epub 2017 Feb 9.
In this paper, a hierarchical deep multi-task learning (HD-MTL) algorithm is developed to support large-scale visual recognition (e.g., automatically recognizing thousands or even tens of thousands of atomic object classes). First, multiple sets of multi-level deep features are extracted from different layers of deep convolutional neural networks (deep CNNs) and used to carry out the coarse-to-fine tasks of hierarchical visual recognition more effectively. A visual tree is then learned by assigning visually similar atomic object classes with similar learning complexities to the same group, which provides a good environment for determining the interrelated learning tasks automatically. By leveraging inter-task relatedness (inter-class similarities) to learn more discriminative group-specific deep representations, our deep multi-task learning algorithm can train more discriminative node classifiers that distinguish visually similar atomic object classes effectively. The HD-MTL algorithm integrates two discriminative regularization terms to control inter-level error propagation, and it provides an end-to-end approach for jointly learning more representative deep CNNs (for image representation) and a more discriminative tree classifier (for large-scale visual recognition) and updating them simultaneously. Our incremental deep learning algorithms can effectively adapt both the deep CNNs and the tree classifier to new training images and new object classes. Our experimental results demonstrate that the HD-MTL algorithm achieves very competitive accuracy rates for large-scale visual recognition.
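The abstract's idea of grouping visually similar classes into a tree and then recognizing from coarse to fine can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes each class is summarized by a mean feature vector, substitutes plain k-means for the visual tree learning, and uses nearest-centroid rules in place of the learned node classifiers. The names `build_visual_tree` and `predict_coarse_to_fine` are hypothetical.

```python
import numpy as np


def build_visual_tree(class_means, n_groups, n_iters=20):
    """Group classes into a two-level tree by k-means on class mean features.

    Assumption: k-means is only a stand-in for the paper's visual tree learning.
    """
    class_means = np.asarray(class_means, dtype=float)
    # Farthest-point initialization keeps the result deterministic.
    chosen = [0]
    while len(chosen) < n_groups:
        d = np.linalg.norm(
            class_means[:, None, :] - class_means[chosen][None, :, :], axis=2
        ).min(axis=1)
        chosen.append(int(d.argmax()))
    centroids = class_means[chosen].copy()

    def assign_groups():
        dists = np.linalg.norm(
            class_means[:, None, :] - centroids[None, :, :], axis=2
        )
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        assign = assign_groups()
        for g in range(n_groups):
            members = class_means[assign == g]
            if len(members) > 0:
                centroids[g] = members.mean(axis=0)
    # Recompute so the assignment matches the final centroids.
    return assign_groups(), centroids


def predict_coarse_to_fine(x, class_means, assign, centroids):
    """Pick the nearest group first, then the nearest class inside that group."""
    x = np.asarray(x, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    group = int(np.linalg.norm(centroids - x, axis=1).argmin())
    members = np.flatnonzero(assign == group)
    if members.size == 0:
        # Degenerate empty group: fall back to searching all classes.
        members = np.arange(len(class_means))
    fine = members[np.linalg.norm(class_means[members] - x, axis=1).argmin()]
    return group, int(fine)


if __name__ == "__main__":
    # Toy data: four classes whose mean features form two visually similar pairs.
    means = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
    assign, centroids = build_visual_tree(means, n_groups=2)
    print(assign, predict_coarse_to_fine([4.9, 5.1], means, assign, centroids))
```

The sketch also makes the inter-level error propagation mentioned in the abstract concrete: if the coarse step selects the wrong group, the fine step cannot recover the correct class, which is the failure mode the paper's regularization terms are described as controlling.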