Center for Computational and Genomic Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA, 19104, USA.
Department of Computer Science and Technology, Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China.
Nat Commun. 2021 Jan 12;12(1):266. doi: 10.1038/s41467-020-20459-8.
Circular RNAs (circRNAs) have emerged as an important class of functional RNA molecules. Short-read RNA sequencing (RNA-seq) is a widely used strategy to identify circRNAs. However, an inherent limitation of short-read RNA-seq is that it does not experimentally determine the full-length sequences and exact exonic compositions of circRNAs. Here, we report isoCirc, a strategy for sequencing full-length circRNA isoforms, using rolling circle amplification followed by nanopore long-read sequencing. We describe an integrated computational pipeline to reliably characterize full-length circRNA isoforms using isoCirc data. Using isoCirc, we generate a comprehensive catalog of 107,147 full-length circRNA isoforms across 12 human tissues and one human cell line (HEK293), including 40,628 isoforms ≥500 nt in length. We identify widespread alternative splicing events within the internal part of circRNAs, including 720 retained intron events corresponding to a class of exon-intron circRNAs (EIciRNAs). Collectively, isoCirc and the companion dataset provide a useful strategy and resource for studying circRNAs in human transcriptomes.
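To illustrate why rolling circle amplification makes full-length sequencing possible, the following is a minimal toy sketch, not the isoCirc pipeline itself. It assumes an error-free long read made of tandem copies of one circular template, and recovers the repeat unit by finding the read's smallest period. The function names (`smallest_period`, `consensus_unit`, `canonical_rotation`) and the example sequence are hypothetical. Real nanopore reads contain sequencing errors, so a practical tool would need error-tolerant tandem repeat detection and alignment to a reference genome.

```python
def smallest_period(read: str) -> int:
    """Return the smallest p with read[i] == read[i + p] for all valid i."""
    n = len(read)
    if n == 0:
        return 0
    # Knuth-Morris-Pratt failure function: fail[i] is the length of the
    # longest proper prefix of read[:i + 1] that is also its suffix.
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and read[i] != read[k]:
            k = fail[k - 1]
        if read[i] == read[k]:
            k += 1
        fail[i] = k
    # The smallest period is the length minus the longest border.
    return n - fail[-1]


def consensus_unit(read: str) -> str:
    """Return one repeat unit of an error-free tandem-repeat read."""
    return read[:smallest_period(read)]


def canonical_rotation(seq: str) -> str:
    """Pick the lexicographically smallest rotation, so that the same
    circle compares equal regardless of where the read started."""
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


if __name__ == "__main__":
    circ = "ACGTTGCAGGATCC"        # hypothetical 14-nt circular template
    read = (circ * 4)[5:45]        # simulated read starting mid-circle
    unit = consensus_unit(read)
    print(len(unit), canonical_rotation(unit) == canonical_rotation(circ))
```

Because the read may start anywhere on the circle, the recovered unit is a rotation of the template. Comparing canonical rotations is one simple way to treat such rotations as the same isoform.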