Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
Medicum, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
Nat Genet. 2022 Mar;54(3):283-294. doi: 10.1038/s41588-021-01009-4. Epub 2022 Feb 21.
DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs). Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF-TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression.