Nair Asha A, Niu Nifang, Tang Xiaojia, Thompson Kevin J, Wang Liewei, Kocher Jean-Pierre, Subramanian Subbaya, Kalari Krishna R
Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA.
Division of Genomic and Molecular Pathology, University of Chicago, Chicago, IL, USA.
Oncotarget. 2016 Dec 6;7(49):80967-80979. doi: 10.18632/oncotarget.13134.
Circular RNAs (circRNAs) are highly stable forms of non-coding RNAs with diverse biological functions. They are implicated in the modulation of gene expression, thus affecting various cellular and disease processes. Based on existing bioinformatics approaches, we developed a comprehensive workflow called Circ-Seq to identify and report expressed circRNAs. Circ-Seq also provides informative genomic annotation along circRNA fused junctions, thus allowing prioritization of circRNA candidates. We first applied Circ-Seq to RNA-sequencing data from breast cancer cell lines and validated one of the large circRNAs identified. Circ-Seq was then applied to a larger cohort of breast cancer samples (n = 885) provided by The Cancer Genome Atlas (TCGA), including tumors and normal-adjacent tissue samples. Notably, the circRNA results reveal that normal-adjacent tissues in the estrogen receptor positive (ER+) subtype have relatively higher numbers of circRNAs than tumor samples in TCGA. A similar phenomenon of high circRNA numbers was observed in normal breast-mammary tissues from the Genotype-Tissue Expression (GTEx) project. Finally, we observed that the number of circRNAs in normal-adjacent samples of the ER+ subtype is inversely correlated with the risk-of-relapse proliferation (ROR-P) score for proliferating genes, suggesting that circRNA frequency may be a marker for cell proliferation in breast cancer. The Circ-Seq workflow runs in both single- and multi-threaded compute environments. We believe that Circ-Seq will be a valuable tool to identify circRNAs useful in the diagnosis and treatment of other cancers and complex diseases.