Department of Computer Science and Engineering, University of Connecticut, Storrs, 06269, CT, USA.
Department of Biomedical Informatics, Harvard Medical School, Boston, 02115, MA, USA.
BMC Genomics. 2018 Aug 13;19(Suppl 6):572. doi: 10.1186/s12864-018-4926-0.
While RNA is usually produced by linear splicing during transcription, recent studies have found that non-canonical splicing sometimes occurs. Non-canonical splicing joins a 3' end to a 5' end and forms so-called circular RNA. Circular RNA is now believed to play important biological roles, such as affecting susceptibility to some diseases. Over the past several years, multiple experimental methods have been developed to enrich circular RNA while degrading linear RNA. Several useful software tools for circular RNA detection have also been developed, but because these tools rely on read mapping, they may miss many circular RNAs. For the same reason, existing tools are slow on large datasets.
In this paper, we present a new computational approach for circular RNA detection, named CircMarker, which is based on k-mers rather than read mapping. CircMarker uses transcriptome annotation files to build the k-mer table it needs for circular RNA detection.
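The general idea of a k-mer table for back-splice detection can be sketched as follows. This is a minimal illustration and not CircMarker's actual implementation: the function names (`build_kmer_table`, `find_back_splice`), the exon list passed in directly as strings (standing in for sequences extracted with an annotation file), and the rule of indexing only the first and last k-mer of each exon are all assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_table(exons, k):
    """Index the first and last k-mer of every exon.

    exons: exon sequences in transcript order (hypothetical input; in
    practice these would come from a genome plus an annotation file).
    Returns a dict mapping k-mer -> set of (exon_index, "start" | "end").
    """
    table = defaultdict(set)
    for idx, seq in enumerate(exons):
        if len(seq) < k:
            continue  # exon too short to contribute a boundary k-mer
        table[seq[:k]].add((idx, "start"))
        table[seq[-k:]].add((idx, "end"))
    return table


def find_back_splice(read, table, k):
    """Return (donor_exon, acceptor_exon) pairs supported by the read.

    A pair is reported when the end k-mer of exon i is immediately
    followed in the read by the start k-mer of exon j with j <= i,
    i.e. the junction runs backwards relative to transcript order.
    """
    hits = set()
    for pos in range(len(read) - 2 * k + 1):
        left = read[pos:pos + k]
        right = read[pos + k:pos + 2 * k]
        for i, tag_i in table.get(left, ()):
            if tag_i != "end":
                continue
            for j, tag_j in table.get(right, ()):
                if tag_j == "start" and j <= i:
                    hits.add((i, j))
    return hits


# Toy example with three made-up exons.
exons = ["ACGTACGGTT", "TTGCAAGCCA", "GGATCCTAGA"]
table = build_kmer_table(exons, k=5)

circular_read = exons[2] + exons[0]  # exon 3 spliced back onto exon 1
linear_read = exons[0] + exons[1]    # ordinary forward junction

print(find_back_splice(circular_read, table, k=5))
print(find_back_splice(linear_read, table, k=5))
```

Because each lookup is a hash-table probe rather than an alignment, a scan of this kind does not depend on mapping reads to a reference. A real tool would also need to handle sequencing errors, reverse-complement reads, and k-mers shared between exons, all of which this sketch ignores.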
Empirical results show that CircMarker outperforms existing circular RNA detection tools in both accuracy and efficiency on many simulated and real datasets.
We developed CircMarker, a new circular RNA detection method based on k-mer analysis. Our results on both simulated and real data show that, compared with existing tools, CircMarker runs much faster and finds more circular RNAs, with higher consensus-based sensitivity and a high accuracy ratio.