

Analyzing the Training Processes of Deep Generative Models.

Publication Information

IEEE Trans Vis Comput Graph. 2018 Jan;24(1):77-87. doi: 10.1109/TVCG.2017.2744938. Epub 2017 Aug 29.

Abstract

Among the many types of deep models, deep generative models (DGMs) provide a solution to the important problem of unsupervised and semi-supervised learning. However, training DGMs requires more skill, experience, and know-how because their training is more complex than other types of deep models such as convolutional neural networks (CNNs). We develop a visual analytics approach for better understanding and diagnosing the training process of a DGM. To help experts understand the overall training process, we first extract a large amount of time series data that represents training dynamics (e.g., activation changes over time). A blue-noise polyline sampling scheme is then introduced to select time series samples, which can both preserve outliers and reduce visual clutter. To further investigate the root cause of a failed training process, we propose a credit assignment algorithm that indicates how other neurons contribute to the output of the neuron causing the training failure. Two case studies are conducted with machine learning experts to demonstrate how our approach helps understand and diagnose the training processes of DGMs. We also show how our approach can be directly used to analyze other types of deep models, such as CNNs.
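The abstract only names the blue-noise polyline sampling idea without describing its mechanics. The snippet below is a minimal, hypothetical sketch (not the paper's algorithm) of how a blue-noise-style selection over time-series polylines might work: a dart-throwing pass that keeps a series only if it is sufficiently far from everything already kept, so dense regions are thinned while isolated outlier curves survive. The helper names (`polyline_features`, `blue_noise_sample`), the `radius` parameter, and the simple mean/std/range feature summary are all assumptions made for illustration.

```python
# Hypothetical sketch of blue-noise-style polyline sampling (dart throwing).
# Not the algorithm from the paper; an illustration only.
import numpy as np

def polyline_features(series):
    """Summarize each time series (rows of `series`) by mean, std, and range."""
    return np.stack([series.mean(axis=1),
                     series.std(axis=1),
                     series.max(axis=1) - series.min(axis=1)], axis=1)

def blue_noise_sample(series, radius=0.8, rng=None):
    """Visit series in random order; keep one only if its feature vector is
    farther than `radius` from every series kept so far. Outliers are far
    from everything, so they are always retained."""
    rng = np.random.default_rng(rng)
    feats = polyline_features(series)
    # Normalize features so `radius` is comparable across dimensions.
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    kept = []
    for i in rng.permutation(len(series)):
        if all(np.linalg.norm(feats[i] - feats[j]) > radius for j in kept):
            kept.append(i)
    return sorted(kept)

if __name__ == "__main__":
    # Usage: thin 500 synthetic activation curves of length 100.
    rng = np.random.default_rng(0)
    curves = np.cumsum(rng.normal(size=(500, 100)), axis=1)
    idx = blue_noise_sample(curves, radius=0.8, rng=0)
    print(f"kept {len(idx)} of {len(curves)} series")
```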

