Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China.
State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing 211166, China; Center of Pathology and Clinical Laboratory, Sir Run Run Hospital, Nanjing Medical University, Nanjing 211166, China.
Genomics Proteomics Bioinformatics. 2019 Oct;17(5):522-534. doi: 10.1016/j.gpb.2019.03.004. Epub 2020 Jan 31.
Circular RNAs (circRNAs), covalently closed continuous RNA loops, are generated from cognate linear RNAs through back-splicing events, and alternative splicing events may generate different circRNA isoforms at the same locus. However, reconstructing and quantifying alternatively spliced full-length circRNAs remains an unresolved challenge. On the basis of the internal structural characteristics of circRNAs, we developed CircAST, a tool that uses multiple splice graphs to assemble alternatively spliced circRNA transcripts and estimate their expression. Simulation studies showed that CircAST correctly assembled the full sequences of circRNAs with a sensitivity of 85.63%-94.32% and a precision of 81.96%-87.55%. By assigning reads to specific isoforms, CircAST quantified the expression of circRNA isoforms with correlation coefficients of 0.85-0.99 between theoretical and estimated values. We evaluated CircAST on an in-house mouse testis RNA-seq dataset treated with RNase R to enrich circRNAs and identified 380 circRNAs with full-length sequences different from those of their corresponding cognate linear RNAs. RT-PCR and Sanger sequencing analyses validated 32 out of 37 randomly selected isoforms, further indicating the good performance of CircAST, especially for isoforms with low abundance. We also applied CircAST to published experimental data and observed substantial diversity in circular transcripts across samples, suggesting that circRNA expression is highly regulated. CircAST can be accessed freely at https://github.com/xiaofengsong/CircAST.
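To make the sensitivity and precision figures concrete, the sketch below shows one common way such assembly metrics are computed against simulated ground truth. It is an illustration only: the abstract does not state the exact definitions used, so the exact-match criterion (identical chromosome, strand, and exon coordinates), the function name `assembly_metrics`, and the toy isoforms are all assumptions and are not taken from CircAST.

```python
from typing import Iterable, Tuple

# Hypothetical representation: an isoform is (chromosome, strand, exons),
# where exons is a tuple of (start, end) coordinates in order.
Exon = Tuple[int, int]
Isoform = Tuple[str, str, Tuple[Exon, ...]]


def assembly_metrics(truth: Iterable[Isoform],
                     assembled: Iterable[Isoform]) -> Tuple[float, float]:
    """Return (sensitivity, precision) under an exact-match criterion.

    Assumption: an assembled isoform counts as correct only if its
    chromosome, strand, and every exon boundary match a simulated isoform.
    """
    truth_set = set(truth)
    assembled_set = set(assembled)
    true_positives = len(truth_set & assembled_set)
    # Sensitivity: fraction of simulated isoforms that were recovered.
    sensitivity = true_positives / len(truth_set) if truth_set else 0.0
    # Precision: fraction of assembled isoforms that are correct.
    precision = true_positives / len(assembled_set) if assembled_set else 0.0
    return sensitivity, precision


# Toy data (invented for illustration): five simulated isoforms.
truth = {
    ("chr1", "+", ((100, 200), (300, 400), (500, 600))),
    ("chr1", "+", ((100, 200), (500, 600))),  # exon-skipping isoform, same locus
    ("chr2", "-", ((1000, 1100), (1300, 1450))),
    ("chr3", "+", ((50, 150),)),
    ("chr4", "+", ((10, 90), (120, 180))),
}

# Four assembled isoforms: three match the truth, one does not.
assembled = {
    ("chr1", "+", ((100, 200), (300, 400), (500, 600))),
    ("chr2", "-", ((1000, 1100), (1300, 1450))),
    ("chr3", "+", ((50, 150),)),
    ("chr1", "+", ((100, 200), (300, 400))),  # not in the simulated set
}

sensitivity, precision = assembly_metrics(truth, assembled)
print(f"sensitivity={sensitivity:.2f} precision={precision:.2f}")
```

In this toy case three of five simulated isoforms are recovered and three of four assembled isoforms are correct. A looser criterion, such as matching only back-splice junctions, would give different numbers, which is why the matching rule is flagged as an assumption here.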