Massively parallel characterization of transcriptional regulatory elements in three diverse human cell types.

Authors

Agarwal Vikram, Inoue Fumitaka, Schubach Max, Martin Beth K, Dash Pyaree Mohan, Zhang Zicong, Sohota Ajuni, Noble William Stafford, Yardimci Galip Gürkan, Kircher Martin, Shendure Jay, Ahituv Nadav

Affiliations

Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.

mRNA Center of Excellence, Sanofi Pasteur Inc., Waltham, MA 02451, USA.

Publication information

bioRxiv. 2023 Mar 6:2023.03.05.531189. doi: 10.1101/2023.03.05.531189.

Abstract

The human genome contains millions of candidate cis-regulatory elements (CREs) with cell-type-specific activities that shape both health and myriad disease states. However, we lack a functional understanding of the sequence features that control the activity and cell-type-specific features of these CREs. Here, we used lentivirus-based massively parallel reporter assays (lentiMPRAs) to test the regulatory activity of over 680,000 sequences, representing a nearly comprehensive set of all annotated CREs among three cell types (HepG2, K562, and WTC11), finding 41.7% to be functional. By testing sequences in both orientations, we find promoters to have significant strand orientation effects. We also observe that their 200 nucleotide cores function as non-cell-type-specific 'on switches' providing similar expression levels to their associated gene. In contrast, enhancers have weaker orientation effects, but increased tissue-specific characteristics. Utilizing our lentiMPRA data, we develop sequence-based models to predict CRE function with high accuracy and delineate regulatory motifs. Testing an additional lentiMPRA library encompassing 60,000 CREs in all three cell types, we further identified factors that determine cell-type specificity. Collectively, our work provides an exhaustive catalog of functional CREs in three widely used cell lines, and showcases how large-scale functional measurements can be used to dissect regulatory grammar.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04ac/10028905/fe0a1e7018c2/nihpp-2023.03.05.531189v1-f0006.jpg
