School of Life Sciences and Technology, Tongji University, 1239 Siping Road, Shanghai 200092, China.
School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
Gigascience. 2020 Jun 1;9(6). doi: 10.1093/gigascience/giaa054.
In cancer cells, fusion genes can produce linear and chimeric fusion-circular RNAs (f-circRNAs), which are functional in gene expression regulation and implicated in malignant transformation, cancer progression, and therapeutic resistance. For specific cancers, proteins encoded by fusion transcripts have been identified as innovative therapeutic targets (e.g., EML4-ALK). Although RNA sequencing (RNA-Seq) technologies combined with existing bioinformatics approaches have enabled researchers to systematically identify fusion transcripts, specifically detecting f-circRNAs remains challenging, owing both to their general sparsity and low abundance in cancer cells and to imperfect computational methods.
We developed the Python-based workflow "Fcirc" to identify fusion linear and f-circRNAs from RNA-Seq data with high specificity. We applied Fcirc to 3 different types of RNA-Seq data scenarios: (i) actual synthetic spike-in RNA-Seq data, (ii) simulated RNA-Seq data, and (iii) actual cancer cell-derived RNA-Seq data. Fcirc showed significant advantages over existing methods regarding both detection accuracy (i.e., precision, recall, F-measure) and computing performance (i.e., lower runtimes).
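The accuracy measures named above (precision, recall, F-measure) can be illustrated with a minimal sketch. It assumes that detection is scored by comparing a set of predicted fusion events against a known truth set. The function name `detection_metrics` and the example fusion lists are hypothetical, chosen only for illustration, and are not taken from Fcirc or its benchmark data.

```python
def detection_metrics(predicted, truth):
    """Return (precision, recall, F-measure) for a set of predicted events.

    Hypothetical helper for illustration; not part of Fcirc.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)  # events that were both predicted and real
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Illustrative inputs only (assumed, not results from the paper).
truth = {"EML4-ALK", "BCR-ABL1", "PML-RARA"}
predicted = {"EML4-ALK", "BCR-ABL1", "KMT2A-AFF1", "TMPRSS2-ERG"}

precision, recall, f_measure = detection_metrics(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

Here two of the four predicted events are in the truth set, so precision is 2/4 and recall is 2/3. The F-measure balances the two, which is why it is commonly reported alongside them when comparing detection tools.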
Fcirc is a powerful and comprehensive Python-based pipeline to identify linear and circular RNA transcripts from known fusion events in RNA-Seq datasets with higher accuracy and shorter computing times compared with previously published algorithms. Fcirc empowers the research community to study the biology of fusion RNAs in cancer more effectively.