Data Science Institute, Imperial College London, SW7 2AZ London, UK.
Department of Health Informatics, University College London, WC1E 6BT London, UK.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab315.
The lack of explainability is one of the most prominent drawbacks of deep learning applications in omics. This 'black box' problem can undermine the credibility of biomedical deep learning models and limit their practical deployment. Here we present XOmiVAE, a variational autoencoder (VAE)-based interpretable deep learning model for cancer classification using high-dimensional omics data. XOmiVAE can reveal the contribution of each gene and each latent dimension to every classification prediction, as well as the correlation between each gene and each latent dimension. We also demonstrate that XOmiVAE explains not only the supervised classification results but also the unsupervised clustering results produced by the deep learning network. To the best of our knowledge, XOmiVAE is one of the first activation level-based interpretable deep learning models able to explain novel clusters generated by a VAE. The explanations produced by XOmiVAE were validated by both the performance of downstream tasks and existing biomedical knowledge. In our experiments, XOmiVAE explanations of deep learning-based cancer classification and clustering aligned with current domain knowledge, including biological annotations and the academic literature, demonstrating the model's potential for novel biomedical knowledge discovery.
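To make the activation-level attribution idea concrete, the sketch below shows one simple way to attribute a chosen latent dimension of a toy VAE encoder back to individual input genes using gradient × input. This is a minimal illustration under stated assumptions, not the paper's implementation: the names (OmicsEncoder, gene_contributions), the network sizes, and the gradient × input attribution rule are all hypothetical stand-ins for whatever attribution method XOmiVAE actually uses.

```python
# Hedged sketch: gradient-x-input attribution of genes to one VAE latent
# dimension. All names and the attribution rule are illustrative assumptions,
# not XOmiVAE's actual implementation.
import torch
import torch.nn as nn

class OmicsEncoder(nn.Module):
    """Toy VAE encoder: gene-expression vector -> latent mean and log-variance."""
    def __init__(self, n_genes: int, latent_dim: int):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(n_genes, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)      # latent means
        self.logvar = nn.Linear(128, latent_dim)  # latent log-variances

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

def gene_contributions(encoder, x, dim):
    """Score each input gene's contribution to latent dimension `dim`
    via gradient x input, one basic activation-level attribution."""
    x = x.clone().requires_grad_(True)
    mu, _ = encoder(x)
    mu[:, dim].sum().backward()   # d(mu_dim)/d(x) for every sample at once
    return (x.grad * x).detach()  # per-sample, per-gene contribution scores

if __name__ == "__main__":
    torch.manual_seed(0)
    enc = OmicsEncoder(n_genes=1000, latent_dim=32)
    expr = torch.rand(8, 1000)    # 8 samples x 1000 genes (synthetic data)
    scores = gene_contributions(enc, expr, dim=5)
    top = scores.abs().mean(dim=0).topk(10).indices
    print("top contributing gene indices:", top.tolist())
```

Ranking genes by the magnitude of such scores, averaged over the samples of one class or cluster, is the kind of per-gene, per-dimension contribution summary the abstract describes; the paper's validation against biological annotation would then operate on these ranked gene lists.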