Genomic positional conservation identifies topological anchor point RNAs linked to developmental loci.
Affiliations
The Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QN, UK.
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
Publication information
Genome Biol. 2018 Mar 15;19(1):32. doi: 10.1186/s13059-018-1405-5.
BACKGROUND
The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality.
RESULTS
We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers.
CONCLUSIONS
This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.