Department of Bioinformatics and Biostatistics, School of Life Sciences & Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China.
Shanghai Center for Bioinformation Technology, Shanghai 200235, China.
Sci Rep. 2017 Mar 27;7:42775. doi: 10.1038/srep42775.
A long non-coding RNA (lncRNA) that overlaps a protein-coding gene forms an lncRNA-coding pair, a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied, and lncRNAs are receiving increasing attention. By studying lncRNA-coding pairs in the human genome, we showed that these pairs were more likely to be generated by overprinting and that genes in lncRNA-coding pairs were retained with higher priority than non-overlapping genes. In addition, which overlapping configurations were preferentially preserved during evolution depended on the origin of the lncRNA-coding pair. Further investigation showed that the promotion of splicing was unilateral, with lncRNAs promoting the splicing of their embedded protein-coding partners, whereas the improvement of gene expression by the presence of an overlapping partner was bidirectional and weakened with increasing evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and this correlation was associated with their overlapping configuration, local genomic environment, and the evolutionary age of the genes. Comparing the expression correlation of lncRNA-coding pairs between normal and cancer samples suggested that lineage-specific pairs that include old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic and comprehensive view of the evolution and expression patterns of human lncRNA-coding pairs.