Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
BMC Bioinformatics. 2019 Jul 19;20(1):401. doi: 10.1186/s12859-019-2957-4.
Visualization tools for deep learning models typically focus on discovering key input features without considering how such low-level features are combined in intermediate layers to make decisions. Moreover, many of these methods examine a network's response to specific input examples, which may be insufficient to reveal the complexity of model decision-making.
We present DeepResolve, an analysis framework for deep convolutional models of genome function that visualizes how input features contribute individually and combinatorially to network decisions. Unlike other methods, DeepResolve does not depend on the analysis of a predefined set of inputs. Rather, it uses gradient ascent to stochastically explore intermediate feature maps to 1) discover important features, 2) visualize their contribution and interaction patterns, and 3) analyze feature sharing across tasks, which suggests shared biological mechanisms. We demonstrate the visualization of decision-making using our proposed method on deep neural networks trained on both experimental and synthetic data. DeepResolve is competitive with existing visualization tools in discovering key sequence features, and identifies certain negative features and non-additive feature interactions that are not easily observed with existing tools. It also recovers similarities between poorly correlated classes that are not observed by traditional methods. DeepResolve reveals that DeepSEA's learned decision structure is shared across genome annotations including histone marks, DNase hypersensitivity, and transcription factor (TF) binding. We identify groups of TFs that suggest known shared biological mechanisms, and recover the correlation between DNase hypersensitivity and TF/chromatin marks.
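The idea of exploring an intermediate feature map by gradient ascent from random starting points can be illustrated with a small NumPy sketch. This is not the DeepResolve implementation: the toy "upper network" (`class_score`), its weights `W`, the non-additive strength `K`, the L2 penalty `LAM`, and the per-channel mean pooling are all hypothetical choices made for illustration. Channel 0 acts as a positive feature, channel 1 as a negative feature, channel 2 is irrelevant, and channel 3 contributes non-additively (a strong activation of either sign raises the score).

```python
import numpy as np

# Hypothetical toy "upper network" acting on a feature map H of shape
# (channels, positions). All constants below are illustrative assumptions.
W = np.array([2.0, -2.0, 0.0, 0.0])  # additive per-channel weights
K = 4.0     # strength of the non-additive (squared) term on channel 3
LAM = 0.1   # L2 penalty that keeps the optimized feature map bounded


def class_score(H):
    """Toy class output: tanh of a pooled, partly non-additive combination."""
    p = H.mean(axis=1)  # per-channel average over positions
    return np.tanh(W @ p + K * p[3] ** 2)


def objective_grad(H):
    """Gradient of class_score(H) - (LAM / 2) * sum(H**2) with respect to H."""
    n_pos = H.shape[1]
    p = H.mean(axis=1)
    z = W @ p + K * p[3] ** 2
    dz_dp = W.copy()
    dz_dp[3] += 2.0 * K * p[3]
    ds_dz = 1.0 - np.tanh(z) ** 2
    # d p[c] / d H[c, l] = 1 / n_pos, so every position in a channel
    # receives the same share of that channel's gradient.
    grad_score = (ds_dz * dz_dp / n_pos)[:, None] * np.ones_like(H)
    return grad_score - LAM * H


def ascend(H0, lr=0.5, steps=2000):
    """Plain gradient ascent in feature-map space from the start point H0."""
    H = H0.copy()
    for _ in range(steps):
        H += lr * objective_grad(H)
    return H


def explore(n_runs=20, n_pos=8, seed=0):
    """Run ascent from several random initializations.

    Returns an (n_runs, channels) array holding the per-channel average
    of each optimized feature map.
    """
    rng = np.random.default_rng(seed)
    per_run = np.empty((n_runs, W.size))
    for r in range(n_runs):
        H0 = rng.normal(scale=0.1, size=(W.size, n_pos))
        per_run[r] = ascend(H0).mean(axis=1)
    return per_run
```

Calling `per_run = explore()` and summarizing with `per_run.mean(axis=0)` and `per_run.var(axis=0)` gives one importance estimate and one spread estimate per channel. In this toy setup the positive and negative channels settle at values of opposite sign, the irrelevant channel decays toward zero, and the non-additive channel ends at either a positive or a negative value depending on the initialization. Reading a large spread across runs as a hint of non-additive behavior is an assumption of this sketch rather than a claim about how DeepResolve scores features.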
DeepResolve can visualize complex feature contribution patterns and feature interactions that contribute to decision-making in genomic deep convolutional networks. It also recovers feature sharing and class similarities that suggest interesting biological mechanisms. DeepResolve is compatible with existing visualization tools and provides complementary insights.