Multi-path x-D Recurrent Neural Networks for Collaborative Image Classification.

Authors

Gao Riqiang, Huo Yuankai, Bao Shunxing, Tang Yucheng, Antic Sanja L, Epstein Emily S, Deppen Steve, Paulson Alexis B, Sandler Kim L, Massion Pierre P, Landman Bennett A

Affiliations

Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN, USA 37235; Vanderbilt University Medical Center, Nashville, TN, USA 37235.

Publication information

Neurocomputing (Amst). 2020 Jul 15;397:48-59. doi: 10.1016/j.neucom.2020.02.033. Epub 2020 Feb 15.

Abstract

With the rapid development of image acquisition and storage, multiple images per class are commonly available for computer vision tasks (e.g., face recognition, object detection, medical imaging). Recently, the recurrent neural network (RNN) has been widely integrated with convolutional neural networks (CNN) to perform image classification on ordered (sequential) data. In this paper, by permuting multiple images into multiple dummy orders, we generalize the ordered (longitudinal) "RNN+CNN" design to a novel unordered setting, called the Multi-path x-D Recurrent Neural Network (MxDRNN), for image classification. To the best of our knowledge, few (if any) existing studies have applied the RNN framework to unordered intra-class images to improve classification performance. Specifically, multiple learning paths are introduced in the MxDRNN to extract discriminative features by permuting the dummy orders of the inputs. Eight datasets from five different fields (MNIST, 3D-MNIST, CIFAR, VGGFace2, and lung screening computed tomography) are included to evaluate the performance of our method. The proposed MxDRNN improves the baseline performance by a large margin across the different application fields (e.g., accuracy from 46.40% to 76.54% on the VGGFace2 test pose set, AUC from 0.7418 to 0.8162 on the NLST lung dataset). Additionally, empirical experiments show that the MxDRNN is more robust to category-irrelevant attributes (e.g., expression and pose in face images), which can make image classification harder and limit algorithm generalizability. The code is publicly available.
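
The multi-path idea can be illustrated with a minimal Python sketch. This is not the authors' released code, and the abstract does not specify the architecture: the plain tanh recurrence, the weights shared across paths, the averaging of the final hidden states, and every name below (rnn_final_state, multi_path_features, num_paths) are assumptions made for illustration. The 4-D feature vectors stand in for CNN outputs of three images from the same class.

```python
import itertools
import math
import random


def rnn_final_state(sequence, w_in, w_rec):
    """Run a plain tanh RNN over `sequence` and return the last hidden state."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def multi_path_features(features, w_in, w_rec, num_paths):
    """Average the final RNN state over `num_paths` dummy orders of `features`.

    Assumption: paths share weights and are pooled by a simple mean.
    """
    # Each permutation of the image indices is one "dummy order" (one path).
    orders = list(itertools.permutations(range(len(features))))[:num_paths]
    finals = [
        rnn_final_state([features[i] for i in order], w_in, w_rec)
        for order in orders
    ]
    pooled = [sum(column) / len(finals) for column in zip(*finals)]
    return pooled, orders


# Toy inputs: three same-class images, each reduced to a 4-D feature vector.
rng = random.Random(0)
features = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
hidden_size = 5
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(hidden_size)]
w_rec = [
    [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
    for _ in range(hidden_size)
]

pooled, orders = multi_path_features(features, w_in, w_rec, num_paths=6)
print(len(orders), "paths; pooled feature:", pooled)
```

With three images, num_paths=6 covers every possible order, so the pooled feature in this sketch does not depend on how the inputs were originally arranged. A smaller num_paths trades that invariance for less computation.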
