Genomics Research Center, Academia Sinica, Taipei, Taiwan.
Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan.
Nucleic Acids Res. 2024 Jan 5;52(D1):D115-D123. doi: 10.1093/nar/gkad829.
Circular RNAs (circRNAs) are RNA molecules with a continuous loop structure characterized by back-splice junctions (BSJs). While analyses of short-read RNA sequencing data have identified millions of BSJ events, determining the exact full-length sequences and alternatively spliced (AS) isoforms of circRNAs is inherently challenging. Recent advances in nanopore long-read sequencing with circRNA enrichment offer an unprecedented opportunity to investigate these issues. Here, we developed FL-circAS (https://cosbi.ee.ncku.edu.tw/FL-circAS/), which collects such long-read sequencing data from 20 cell lines/tissues and thereby identifies 884 636 BSJs with 1 853 692 full-length circRNA isoforms in human and 115 173 BSJs with 135 617 full-length circRNA isoforms in mouse. FL-circAS also provides multiple circRNA features. For circRNA expression, FL-circAS calculates the expression level of each circRNA isoform, cell line/tissue specificity at both the BSJ and isoform levels, and AS entropy for each BSJ across samples. For circRNA biogenesis, FL-circAS identifies reverse complementary sequences and RNA binding protein (RBP) binding sites in the flanking sequences of BSJs. For functional patterns, FL-circAS identifies potential microRNA/RBP binding sites and several types of evidence for circRNA translation on each full-length circRNA isoform. FL-circAS provides user-friendly interfaces for browsing, searching, analyzing, and downloading data, serving as the first resource for discovering full-length circRNAs at the isoform level.
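The abstract does not define how AS entropy is computed. As a minimal sketch, assuming it is the Shannon entropy of the relative expression of the full-length isoforms that share one BSJ, the calculation could look like the following. The function name `as_entropy`, the isoform labels, and the expression values are hypothetical and are not taken from FL-circAS.

```python
import math
from typing import Mapping


def as_entropy(isoform_expression: Mapping[str, float]) -> float:
    """Shannon entropy (in bits) of isoform usage for a single BSJ.

    Assumption: AS entropy is the Shannon entropy of each isoform's share
    of the total expression at the BSJ. The exact definition used by
    FL-circAS may differ.
    """
    total = sum(isoform_expression.values())
    if total <= 0:
        raise ValueError("total isoform expression must be positive")

    entropy = 0.0
    for value in isoform_expression.values():
        if value > 0:  # isoforms with zero expression contribute nothing
            p = value / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical read counts for isoforms sharing one BSJ.
single_isoform = {"isoform_1": 40.0}
two_equal_isoforms = {"isoform_1": 20.0, "isoform_2": 20.0}

print(as_entropy(single_isoform))       # one isoform only: no splicing diversity
print(as_entropy(two_equal_isoforms))   # two equally used isoforms: one bit
```

Under this assumption, a BSJ with a single dominant isoform has entropy near zero, while a BSJ whose expression is spread evenly over many isoforms has higher entropy.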