School of Computing Science, Simon Fraser University, Burnaby, BC V5A1S6, Canada.
Vancouver Prostate Centre, Vancouver, BC V6H3Z6, Canada.
Bioinformatics. 2020 Jun 1;36(12):3703-3711. doi: 10.1093/bioinformatics/btaa232.
High-throughput sequencing has revealed that circular RNAs (circRNAs) are abundant across a variety of eukaryotes. circRNAs are associated with several diseases, including cancer, in which they can act as oncogenes or tumor suppressors; they therefore have potential as biomarkers or therapeutic targets. Accurate and rapid detection of circRNAs from short reads remains computationally challenging because identifying chimeric reads, which is essential for finding back-splice junctions, is a complex process. The sensitivity of discovery methods relies heavily on the underlying mapper used to find chimeric reads. Furthermore, all available circRNA discovery pipelines are resource intensive.
We introduce CircMiner, a novel stand-alone circRNA detection method that rapidly identifies and filters out linear RNA sequencing reads and detects back-splice junctions. CircMiner employs a rapid pseudo-alignment technique to identify linear reads that originate from transcripts, genes or the genome. It then processes the remaining reads to identify back-splice junctions and detect circRNAs with single-nucleotide resolution. We evaluated CircMiner on simulated datasets generated from known back-splice junctions and showed that it has higher accuracy and speed than existing circRNA detection tools. Additionally, on two RNase R-treated cell line datasets, CircMiner detected most of the consistent, high-confidence circRNAs when compared with untreated samples of the same cell line.
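The two-stage idea described above (set aside reads that map linearly, then look for back-splice junctions in the remainder) can be illustrated with a toy sketch. This is not CircMiner's algorithm: exact substring search stands in for pseudo-alignment, and the function name `classify_read`, the `min_anchor` parameter, the 0-based coordinates and the example sequence are all assumptions made for illustration. A read is reported as a back-splice candidate when its head segment maps downstream of its tail segment, which is the segment order a circularized exon would produce.

```python
from typing import Optional, Tuple, Union

Result = Tuple[str, Optional[Union[int, Tuple[int, int]]]]


def classify_read(genome: str, read: str, min_anchor: int = 8) -> Result:
    """Toy classifier (hypothetical helper, not part of CircMiner).

    Returns ("linear", start), ("back_splice", (donor_end, acceptor_start)),
    or ("unmapped", None). Coordinates are 0-based and inclusive.
    """
    # Stage 1: a read that matches the reference contiguously is linear.
    start = genome.find(read)
    if start != -1:
        return ("linear", start)

    # Stage 2: try every split point that leaves min_anchor bases per side.
    for split in range(min_anchor, len(read) - min_anchor + 1):
        head, tail = read[:split], read[split:]
        head_pos = genome.find(head)  # first exact occurrence only
        tail_pos = genome.find(tail)
        if head_pos == -1 or tail_pos == -1:
            continue
        # Back-splice order: the tail of the read maps at or upstream of
        # the position where the head of the read starts.
        if tail_pos <= head_pos:
            donor_end = head_pos + split - 1
            acceptor_start = tail_pos
            return ("back_splice", (donor_end, acceptor_start))
        # Tail downstream of head would be an ordinary (forward) splice;
        # this sketch does not report those and keeps searching.

    return ("unmapped", None)


# Made-up 40-base reference; positions 10..29 play the role of a circular exon.
GENOME = "ACGTTGCAAGCTTAGGCTCCATGATCGGATACCGTTAAGC"

linear_read = GENOME[5:25]
# End of the exon joined to its start, as a back-splice would produce.
circ_read = GENOME[20:30] + GENOME[10:20]

print(classify_read(GENOME, linear_read))  # ('linear', 5)
print(classify_read(GENOME, circ_read))    # ('back_splice', (29, 10))
```

A real detector also has to handle mismatches, repeats, both strands, paired-end reads and annotated splice sites, none of which this sketch attempts.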
CircMiner is implemented in C++ and is available online at https://github.com/vpc-ccg/circminer.
Supplementary data are available at Bioinformatics online.