Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK.
The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.
Genome Biol. 2023 Aug 15;24(1):189. doi: 10.1186/s13059-023-03021-9.
The binding of transcription factors at proximal promoters and distal enhancers is central to gene regulation, yet identifying regulatory motifs and quantifying their impact on expression remains challenging. Using a convolutional neural network trained on single-cell data, we infer putative regulatory motifs and their cell type-specific importance. Our model, scover, explains 29% of the variance in gene expression in multiple mouse tissues. Applying scover to distal enhancers identified by scATAC-seq in the developing human brain, we identify cell type-specific motif activities. Scover identifies regulatory motifs and their importance from single-cell data, and all of its parameters and outputs are easily interpretable.
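To make the modelling idea concrete, the following is a minimal sketch of how a convolutional filter can act as a motif detector whose pooled score is weighted separately for each cell type. It is not scover's implementation: the abstract states only that a convolutional neural network is used, so the global max pooling, the linear readout, the function names, the toy filter, and the weights are all assumptions made for illustration.

```python
# Hypothetical illustration only: names, filter values and weights are made up
# and do not come from scover.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).
    Any other character (e.g. N) becomes an all-zero row."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def scan_filter(encoded, motif_filter):
    """Slide one convolutional filter along the sequence and return the
    best match score (global max pooling, an assumed design choice)."""
    width = len(motif_filter)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * motif_filter[i][j]
            for i in range(width)
            for j in range(4)
        )
        scores.append(score)
    return max(scores)

def predict_expression(seq, filters, weights):
    """Combine pooled filter scores linearly into one value per cell type.
    weights[k][c] is the influence of filter k in cell type c."""
    encoded = one_hot(seq)
    pooled = [scan_filter(encoded, f) for f in filters]
    n_cell_types = len(weights[0])
    return [
        sum(pooled[k] * weights[k][c] for k in range(len(filters)))
        for c in range(n_cell_types)
    ]

# Toy filter that rewards the made-up motif "TGA" (rows: positions; columns: A, C, G, T).
tga_filter = [
    [0.0, 0.0, 0.0, 1.0],  # T
    [0.0, 0.0, 1.0, 0.0],  # G
    [1.0, 0.0, 0.0, 0.0],  # A
]
# One filter, two hypothetical cell types: positive weight in the first,
# negative weight in the second.
toy_weights = [[0.5, -0.2]]

prediction = predict_expression("CCTGACC", [tga_filter], toy_weights)
```

In a design like this, interpretability follows from the structure: each filter can be read as a motif, and each weight as that motif's signed influence in one cell type. In a trained model both would be learned from data rather than set by hand.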