Ferguson Lucas, Upton Heather E, Pimentel Sydney C, Jeans Chris, Ingolia Nicholas T, Collins Kathleen
Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA.
Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
bioRxiv. 2024 Nov 10:2024.11.09.622813. doi: 10.1101/2024.11.09.622813.
Sequencing RNAs that are biologically processed or degraded to less than ~100 nucleotides typically involves multi-step, low-yield protocols with bias and information loss inherent to ligation and/or polynucleotide tailing. We recently introduced Ordered Two-Template Relay (OTTR), a method that obligatorily captures the end-to-end sequences of input molecules and, in the same reverse transcription step, appends 5' and 3' sequencing adapters of choice. OTTR has been thoroughly benchmarked for optimal production of microRNA (miRNA), tRNA and tRNA fragment, and ribosome-protected mRNA footprint libraries. Here we sought to characterize, quantify, and ameliorate any remaining bias or imprecision in the end-to-end capture of RNA sequences. We introduce new metrics for the evaluation of sequence capture and use them to optimize reaction buffers, reverse transcriptase sequence, adapter oligonucleotides, and overall workflow. Modifications of the reverse transcriptase and adapter oligonucleotides increased the 3' and 5' end-precision of sequence capture and minimized overall library bias. Improvements in recombinant expression and purification of the truncated R2 reverse transcriptase used in OTTR reduced non-productive sequencing reads by minimizing the bacterial nucleic acids that compete with low-input RNA molecules for cDNA synthesis, such that with a miRNA input of 3 picograms (less than 1 fmol), fewer than 10% of sequencing reads are bacterial nucleic acid contaminants. We also introduce a rapid, automation-compatible OTTR protocol that enables gel-free, length-agnostic enrichment of cDNA duplexes away from unwanted adapter-only side products. Overall, this work informs considerations for unbiased end-to-end capture and annotation of RNAs independent of their sequence, structure, or post-transcriptional modifications.
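The statement that 3 picograms of miRNA is less than 1 fmol can be checked with a short calculation. The sketch below is illustrative only and not part of the study: it assumes a 22-nucleotide miRNA (the abstract does not state a length) and uses the common rule-of-thumb approximation for single-stranded RNA molecular weight, length × 320.5 + 159.0 g/mol. The function name `ssrna_mass_to_fmol` is hypothetical.

```python
def ssrna_mass_to_fmol(mass_pg: float, length_nt: int) -> float:
    """Convert a mass of single-stranded RNA (picograms) to femtomoles.

    Assumes the rule-of-thumb molecular weight for ssRNA:
    MW (g/mol) ~= length_nt * 320.5 + 159.0
    """
    mw_g_per_mol = length_nt * 320.5 + 159.0
    # pg -> g is a factor of 1e-12 and mol -> fmol is a factor of 1e15,
    # so the combined conversion factor is 1e3.
    return mass_pg * 1e3 / mw_g_per_mol


# Assumption: a typical miRNA length of 22 nt (not stated in the abstract).
fmol = ssrna_mass_to_fmol(mass_pg=3.0, length_nt=22)
print(f"{fmol:.2f} fmol")  # prints "0.42 fmol"
```

Under these assumptions, 3 pg corresponds to roughly 0.4 fmol, consistent with the "less than 1 fmol" figure; longer input RNAs of the same mass would correspond to even fewer moles.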