Microsoft, AI for Good Research Lab, Redmond, WA, USA.
Calico Life Sciences, South San Francisco, CA, USA.
Genome Biol. 2022 Aug 15;23(1):174. doi: 10.1186/s13059-022-02723-w.
We present BindVAE, a novel unsupervised deep learning approach based on Dirichlet variational autoencoders, for jointly decoding multiple transcription factor (TF) binding signals from open chromatin regions. BindVAE can disentangle an input DNA sequence into distinct latent factors that encode cell-type-specific in vivo binding signals for individual TFs, composite patterns for TFs involved in cooperative binding, and the genomic context surrounding the binding sites. On the task of retrieving the motifs of expressed TFs in a given cell type, BindVAE is competitive with existing motif discovery approaches.