Graduate Group in Biomedical Engineering, University of California, Davis, Davis, CA, USA.
Genome Center, University of California, Davis, Davis, CA, USA.
Genome Biol. 2023 Feb 20;24(1):29. doi: 10.1186/s13059-023-02850-y.
Neural networks such as variational autoencoders (VAEs) perform dimensionality reduction for the visualization and analysis of genomic data, but are limited in their interpretability: it is unknown which data features are represented by each embedding dimension. We present siVAE, a VAE that is interpretable by design, thereby enhancing downstream analysis tasks. Through interpretation, siVAE also identifies gene modules and hubs without explicit gene network inference. We use siVAE to identify gene modules whose connectivity is associated with diverse phenotypes such as iPSC neuronal differentiation efficiency and dementia, showcasing the wide applicability of interpretable generative models for genomic data analysis.