Stewart Greg L, Enfield Katey S S, Sage Adam P, Martinez Victor D, Minatel Brenda C, Pewarchuk Michelle E, Marshall Erin A, Lam Wan L
BC Cancer Research Centre, Vancouver, BC, Canada.
The Francis Crick Institute, London, United Kingdom.
Front Genet. 2019 Mar 6;10:138. doi: 10.3389/fgene.2019.00138. eCollection 2019.
Transcriptome sequencing has led to the widespread identification of long non-coding RNAs (lncRNAs). Subsequently, these genes have been shown to hold functional importance in human cellular biology, which can be exploited by tumors to drive the hallmarks of cancer. Due to the complex tertiary structure and unknown binding motifs of lncRNAs, there is a growing disparity between the number of lncRNAs identified and those that have been functionally characterized. As such, lncRNAs deregulated in cancer may represent critical components of cancer pathways that could serve as novel therapeutic intervention points. Pseudogenes are non-coding DNA sequences that are defunct relatives of their protein-coding parent genes but retain high sequence similarity. Interestingly, certain lncRNAs expressed from pseudogene loci have been shown to regulate the protein-coding parent genes of these pseudogenes, in part because of this sequence complementarity. We hypothesize that this phenomenon occurs more broadly than previously realized, and that aberrant expression of lncRNAs overlapping pseudogene loci provides an alternative mechanism of cancer gene deregulation. Using RNA-sequencing data from two cohorts of lung adenocarcinoma, each paired with patient-matched non-malignant lung samples, we discovered 104 deregulated pseudogene-derived lncRNAs. Remarkably, many of these deregulated lncRNAs (i) were expressed from the loci of pseudogenes related to known cancer genes, (ii) had expression that significantly correlated with protein-coding parent gene expression, and (iii) had lncRNA protein-coding parent gene expression that was significantly associated with survival. Here, we uncover evidence suggesting that the lncRNA-pseudogene-protein-coding gene axis is a prominent mechanism of cancer gene regulation in lung adenocarcinoma, and we highlight the clinical utility of exploring the non-coding regions of the cancer transcriptome.