Fang Hui, Wang Victoria, Yamaguchi Motonori
Computer Science Department, Liverpool John Moores University, Liverpool L3 3AF, UK.
Institute for Criminal Justice Studies, University of Portsmouth, Portsmouth PO1 2HY, UK.
Entropy (Basel). 2018 Oct 26;20(11):823. doi: 10.3390/e20110823.
Deep learning (DL) networks are a recent revolutionary development in artificial intelligence research. A typical network is built by stacking groups of layers, each composed of many convolutional kernels or neurons. In network design, many hyper-parameters must be set heuristically before training in order to achieve high cross-validation accuracy. However, evaluating accuracy at the output layer alone is not sufficient to specify the roles of the hidden units in a network. This leaves a significant gap between the wide application of DL and its limited theoretical understanding. To narrow this gap, our study explores visualization techniques that illustrate the mutual information (MI) in DL networks. MI is a theoretical measure that reflects the relationship between two sets of random variables, even when that relationship is highly non-linear and hidden in high-dimensional data. Our study aims to understand the roles that DL units play in the classification performance of the networks. Through a series of experiments on several popular DL networks, we show that visualizing the MI between the input/output and the hidden layers and basic units, together with its patterns of change, facilitates a better understanding of these units' roles. Our investigation of network convergence suggests a potentially more objective way to evaluate DL networks. Furthermore, the visualization provides a useful tool for gaining insight into network performance, and thus may facilitate the design of better network architectures by identifying redundant and less effective network units.
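The claim that MI captures relationships a linear measure would miss can be illustrated with a small sketch. The example below is not the estimator used in the study, which the abstract does not specify; it is a simple histogram-based plug-in estimate, and the function name `binned_mutual_information`, the bin count, and the synthetic data are all assumptions made for illustration. It compares the Pearson correlation and the estimated MI for a variable that depends on the input quadratically, and for one that is independent of it.

```python
import numpy as np


def binned_mutual_information(x, y, bins=16):
    """Plug-in MI estimate in bits from a 2-D histogram (illustrative only)."""
    joint_counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint_counts / joint_counts.sum()   # joint distribution over bins
    p_x = p_xy.sum(axis=1, keepdims=True)      # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)      # marginal of y, shape (1, bins)
    nonzero = p_xy > 0                         # skip empty cells to avoid log(0)
    ratio = p_xy[nonzero] / (p_x @ p_y)[nonzero]
    return float(np.sum(p_xy[nonzero] * np.log2(ratio)))


# Synthetic data (an assumption of this sketch, not data from the study).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20000)
y_dependent = x ** 2                           # non-linear, symmetric dependence
y_independent = rng.uniform(-1.0, 1.0, size=20000)

corr = np.corrcoef(x, y_dependent)[0, 1]       # linear measure of dependence
mi_dep = binned_mutual_information(x, y_dependent)
mi_ind = binned_mutual_information(x, y_independent)

print(f"Pearson correlation (x, x^2): {corr:.3f}")
print(f"Estimated MI (x, x^2):        {mi_dep:.3f} bits")
print(f"Estimated MI (x, noise):      {mi_ind:.3f} bits")
```

In this setting the correlation stays close to zero because the dependence is symmetric around zero, while the MI estimate is clearly larger for the dependent pair than for the independent one. Histogram estimators are biased and scale poorly with dimension, so applying the idea to the activations of a hidden layer would likely call for a different estimator.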