Biomolecular Engineering Department, UC Santa Cruz, California 95064, USA.
New England Biolabs, Ipswich, Massachusetts 01938, USA.
RNA. 2022 Feb;28(2):162-176. doi: 10.1261/rna.078703.121. Epub 2021 Nov 2.
Nanopore sequencing devices read individual RNA strands directly. This facilitates identification of exon linkages and nucleotide modifications; however, using conventional direct RNA nanopore sequencing, the 5' and 3' ends of poly(A) RNA cannot be identified unambiguously. This is due in part to RNA degradation in vivo and in vitro that can obscure transcription start and end sites. In this study, we aimed to identify individual full-length human RNA isoforms among ∼4 million nanopore poly(A)-selected RNA reads. First, to identify RNA strands bearing 5' m7G caps, we exchanged the biological cap for a modified cap attached to a 45-nt oligomer. This oligomer adaptation method improved 5' end sequencing and ensured correct identification of the 5' m7G capped ends. Second, among these 5'-capped nanopore reads, we screened for features consistent with a 3' polyadenylation site. Combining these two steps, we identified 294,107 individual high-confidence full-length RNA scaffolds from human GM12878 cells, most of which (257,721) aligned to protein-coding genes. Of these, 4876 scaffolds indicated unannotated isoforms that were often internal to longer, previously identified RNA isoforms. Orthogonal data for m7G caps and open chromatin, such as CAGE and DNase-HS seq, confirmed the validity of these high-confidence RNA scaffolds.
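The two-step screen described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Read` record and its two boolean flags are hypothetical stand-ins for whatever adapter-detection and 3'-end criteria were actually applied, and the example reads are invented for illustration. Only the two scaffold counts in the last lines come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Read:
    read_id: str
    has_cap_adapter: bool  # hypothetical flag: 45-nt cap adapter found at the 5' end
    has_polya_site: bool   # hypothetical flag: 3' end consistent with a polyadenylation site


def full_length_candidates(reads):
    """Keep reads that pass both screens, applied in the order the abstract gives."""
    capped = [r for r in reads if r.has_cap_adapter]   # step 1: 5'-capped reads
    return [r for r in capped if r.has_polya_site]     # step 2: 3' poly(A) site features


# Invented toy reads, for illustration only.
reads = [
    Read("r1", True, True),    # passes both screens
    Read("r2", True, False),   # capped, but no 3' poly(A) site evidence
    Read("r3", False, True),   # 3' end looks complete, but no cap adapter
]
selected = full_length_candidates(reads)

# Share of high-confidence scaffolds aligned to protein-coding genes,
# computed from the two counts reported in the abstract.
protein_coding_pct = 257_721 / 294_107 * 100
```

Requiring both conditions is what separates a full-length candidate from a read truncated at either end, which is the ambiguity the abstract attributes to conventional direct RNA sequencing.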