CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria.
Department of Laboratory Medicine, Medical University of Vienna, Vienna, Austria.
Genome Biol. 2020 Aug 3;21(1):190. doi: 10.1186/s13059-020-02100-5.
Deep learning has emerged as a versatile approach for predicting complex biological phenomena. However, its utility for biological discovery has so far been limited, given that generic deep neural networks provide little insight into the biological mechanisms that underlie a successful prediction. Here we demonstrate deep learning on biological networks, where every node has a molecular equivalent, such as a protein or gene, and every edge has a mechanistic interpretation, such as a regulatory interaction along a signaling pathway.
With knowledge-primed neural networks (KPNNs), we exploit the ability of deep learning algorithms to assign meaningful weights in multi-layered networks, resulting in a widely applicable approach for interpretable deep learning. We present a learning method that enhances the interpretability of trained KPNNs by stabilizing node weights in the presence of redundancy, enhancing the quantitative interpretability of node weights, and controlling for uneven connectivity in biological networks. We validate KPNNs on simulated data with known ground truth and demonstrate their practical use and utility in five biological applications with single-cell RNA-seq data for cancer and immune cells.
We introduce KPNNs as a method that combines the predictive power of deep learning with the interpretability of biological networks. While demonstrated here on single-cell sequencing data, this method is broadly relevant to other research areas where prior domain knowledge can be represented as networks.
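The central idea above, a neural network whose nodes correspond to genes or proteins and whose edges correspond to known regulatory interactions, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy three-layer layout (genes as inputs, transcription factors as hidden nodes, one phenotype output), and all node names, edge lists, and helper functions (`build_mask`, `forward`) are hypothetical. A 0/1 mask built from the prior-knowledge edges is multiplied into each weight matrix, so only edges that exist in the biological network can carry signal.

```python
import numpy as np

def build_mask(sources, targets, edges):
    """Return a 0/1 matrix (sources x targets) with 1 where a known edge exists."""
    s_idx = {name: i for i, name in enumerate(sources)}
    t_idx = {name: j for j, name in enumerate(targets)}
    mask = np.zeros((len(sources), len(targets)))
    for s, t in edges:
        mask[s_idx[s], t_idx[t]] = 1.0
    return mask

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, masks):
    """Forward pass through masked layers; returns the activations of each layer."""
    activations = []
    h = x
    for w, m in zip(weights, masks):
        # Element-wise masking removes every connection absent from the prior network.
        h = sigmoid(h @ (w * m))
        activations.append(h)
    return activations

# Hypothetical toy network (illustrative names, not taken from the paper).
genes = ["gene_1", "gene_2", "gene_3"]
tfs = ["TF_A", "TF_B"]
outputs = ["phenotype"]
gene_to_tf = [("gene_1", "TF_A"), ("gene_2", "TF_A"),
              ("gene_2", "TF_B"), ("gene_3", "TF_B")]
tf_to_out = [("TF_A", "phenotype"), ("TF_B", "phenotype")]

masks = [build_mask(genes, tfs, gene_to_tf),
         build_mask(tfs, outputs, tf_to_out)]

# Random, untrained weights; a real model would learn these from expression data.
rng = np.random.default_rng(0)
weights = [rng.normal(size=m.shape) for m in masks]

x = np.array([[1.0, 0.5, 0.0]])  # one cell, expression of three genes
tf_act, out_act = forward(x, weights, masks)
```

Because each hidden unit maps to a named molecule, its activation and learned weights can be read as statements about that molecule. In this sketch, changing `gene_3` cannot affect `TF_A`, since no such edge exists in the assumed prior network.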