Cheng Hao, Lian Dongze, Gao Shenghua, Geng Yanlin
Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China.
University of Chinese Academy of Sciences, Beijing 100049, China.
Entropy (Basel). 2019 May 1;21(5):456. doi: 10.3390/e21050456.
Inspired by the pioneering work on the information bottleneck (IB) principle for the analysis of Deep Neural Networks (DNNs), we thoroughly study the relationship among the model accuracy, I(X;T), and I(T;Y), where I(X;T) and I(T;Y) are the mutual information of the DNN's output T with the input X and the label Y, respectively. We then design an information-plane-based framework to evaluate the capability of DNNs (including CNNs) for image classification. Instead of each hidden layer's output, our framework focuses on the model output. We successfully apply our framework to many application scenarios arising in deep learning and image classification, such as image classification with an unbalanced data distribution, model selection, and transfer learning. The experimental results verify the effectiveness of the information-plane-based framework: our framework may facilitate quick model selection and determine the number of samples needed for each class in the unbalanced classification problem. Furthermore, the framework explains the efficiency of transfer learning in the deep learning area.
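To make the quantities on the information plane concrete, below is a minimal sketch of how I(T;Y) could be estimated from a model's output T and labels Y by discretizing T into bins and applying a plug-in estimator, a common approach in information-plane analyses. The binning scheme, bin count, and helper names here are illustrative assumptions, not the authors' exact estimator.

```python
# Sketch: estimate I(T; Y) by discretizing the model output T into bins.
# Assumption: a simple equal-width binning plug-in estimator (not necessarily
# the estimator used in the paper).
import numpy as np

def discrete_mutual_information(t_bins: np.ndarray, y: np.ndarray) -> float:
    """Plug-in estimate of I(T; Y) in bits from two discrete arrays."""
    n = len(y)
    joint = {}
    for a, b in zip(t_bins, y):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    # Marginal counts for T and Y.
    pt, py = {}, {}
    for (a, b), c in joint.items():
        pt[a] = pt.get(a, 0) + c
        py[b] = py.get(b, 0) + c
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * np.log2(p_ab / ((pt[a] / n) * (py[b] / n)))
    return mi

def bin_outputs(t: np.ndarray, num_bins: int = 30) -> np.ndarray:
    """Map each continuous output vector to a single discrete symbol."""
    edges = np.linspace(t.min(), t.max(), num_bins + 1)
    digitized = np.digitize(t, edges[1:-1])  # per-dimension bin indices
    # Hash each row of bin indices so every output vector becomes one symbol.
    return np.array([hash(row.tobytes()) for row in digitized])

# Toy usage with stand-in data (random "softmax outputs" and labels).
rng = np.random.default_rng(0)
t = rng.random((1000, 10))          # model output T (e.g., softmax over 10 classes)
y = rng.integers(0, 10, size=1000)  # ground-truth labels Y
print(discrete_mutual_information(bin_outputs(t), y))
```

I(X;T) can be estimated in the same way by binning both the input X and the output T and applying the same plug-in formula; tracking the pair (I(X;T), I(T;Y)) over training or across models gives the coordinates on the information plane that the framework evaluates.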