Nucleic Acids Res. 2017 Feb 28;45(4):e16. doi: 10.1093/nar/gkw942.
Recent large-scale genomics efforts to characterize the cis-regulatory sequences that orchestrate genome-wide expression patterns have produced impressive catalogues of putative regulatory elements. Most of these sequences have not been functionally tested, and our limited understanding of the non-coding genome prevents us from predicting which sequences are bona fide cis-regulatory elements. Recently, massively parallel reporter assays (MPRAs) have been deployed to measure the activity of putative cis-regulatory sequences in several biological contexts, each with specific advantages and distinct limitations. We developed LV-MPRA, a novel lentivirus-based massively parallel reporter gene assay, to study the function of genome-integrated regulatory elements in any mammalian cell type, making it possible to apply MPRAs in more biologically relevant contexts. We measured the activity of 2,600 sequences in U87 glioblastoma cells and human neural progenitor cells (hNPCs) and explored how regulatory activity is encoded in DNA sequence. We demonstrate that LV-MPRA can be applied to estimate the effects of local DNA sequence and regional chromatin on regulatory activity. Our data reveal that primary DNA sequence features, such as GC content and dinucleotide composition, accurately distinguish sequences with high activity from sequences with low activity in a full chromosomal context, and may also function in combination with different transcription factor binding sites to determine cell type specificity. We conclude that LV-MPRA will be an important tool for identifying cis-regulatory elements and for advancing our understanding of how the non-coding genome encodes information.
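The abstract names GC content and dinucleotide composition as primary sequence features that separate high-activity from low-activity sequences, but it does not say how those features were computed. As a minimal sketch, assuming overlapping dinucleotide counts normalized to frequencies and ignoring non-ACGT characters, they could be derived as follows. The function names `gc_content` and `dinucleotide_frequencies` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
from typing import Dict

# All 16 possible dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def gc_content(seq: str) -> float:
    """Fraction of G or C among the A/C/G/T bases of seq (0.0 if none)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def dinucleotide_frequencies(seq: str) -> Dict[str, float]:
    """Relative frequency of each overlapping dinucleotide in seq.

    Assumption: windows containing a non-ACGT character (e.g. N) are
    skipped, and frequencies are normalized over the remaining windows.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DINUCLEOTIDES)
    return {d: (counts[d] / total if total else 0.0) for d in DINUCLEOTIDES}


if __name__ == "__main__":
    example = "ACGTGGCC"  # illustrative sequence, not from the study
    print(gc_content(example))
    print(dinucleotide_frequencies(example)["CG"])
```

Feature vectors of this kind (one GC value plus 16 dinucleotide frequencies per sequence) are a common input to a classifier that separates high-activity from low-activity sequences. Which model, if any, the authors used is not stated in the abstract.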