McKinlay Anastasia, Fultz Dalen, Wang Feng, Pikaard Craig S
Department of Biology and Department of Molecular and Cellular Biochemistry, Indiana University, Bloomington, IN, United States.
Howard Hughes Medical Institute, Indiana University, Bloomington, IN, United States.
Front Plant Sci. 2021 Apr 28;12:656049. doi: 10.3389/fpls.2021.656049. eCollection 2021.
Large regions of nearly identical repeats, such as the 45S ribosomal RNA (rRNA) genes of Nucleolus Organizer Regions (NORs), can account for major gaps in sequenced genomes. Ultra-long sequencing reads that span multiple repeats have the potential to make these regions assemblable, because a set of adjacent repeats can collectively carry enough sequence variation to define an interval unambiguously and to identify overlapping reads. Because individual repetitive loci typically represent a small proportion of the genome, methods that enrich for the regions of interest are desirable. Here we describe a simple method that achieves greater than tenfold enrichment of 45S rRNA gene sequences among ultra-long Oxford Nanopore Technologies sequencing reads. The method employs agarose-embedded genomic DNA that is digested with a cocktail of restriction endonucleases predicted to be non-cutters of rRNA genes. Most of the genome is digested into small fragments that diffuse out of the agarose plugs, whereas rRNA gene arrays are retained. In principle, the approach can also be adapted for sequencing other repetitive loci for which gaps exist in a reference genome.
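The idea of choosing enzymes "predicted to be non-cutters" can be illustrated with a minimal sketch, which is not the authors' pipeline. It assumes exact, non-degenerate recognition sites and a repeat unit supplied as a plain string. The function names, the candidate enzyme panel, and the `tandem` option are hypothetical choices made for this illustration. The recognition sites listed are the standard ones for these common enzymes, but they are not the cocktail used in the study. The last function shows one common way to express fold enrichment, as a ratio of target-read fractions; the abstract does not state how enrichment was computed.

```python
# Illustrative sketch only: the enzyme panel and all names below are
# hypothetical and are not taken from the study.

# Standard recognition sites for a few common restriction enzymes.
CANDIDATE_ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NotI": "GCGGCCGC",
    "XhoI": "CTCGAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_sites(repeat_seq: str, site: str, tandem: bool = True) -> int:
    """Count occurrences of a recognition site on either strand.

    With tandem=True, the start of the repeat is appended to its end so
    that sites spanning the junction between adjacent copies are found.
    Degenerate (ambiguous-base) sites are not handled.
    """
    seq = repeat_seq.upper()
    site = site.upper()
    if tandem:
        seq += seq[: len(site) - 1]
    # A set avoids double counting palindromic sites.
    patterns = {site, reverse_complement(site)}
    total = 0
    for pattern in patterns:
        start = seq.find(pattern)
        while start != -1:
            total += 1
            start = seq.find(pattern, start + 1)
    return total


def predicted_non_cutters(repeat_seq: str, enzymes: dict) -> list:
    """Return the names of enzymes with no site in the repeat sequence."""
    return sorted(
        name for name, site in enzymes.items()
        if count_sites(repeat_seq, site) == 0
    )


def fold_enrichment(target_treated: int, total_treated: int,
                    target_control: int, total_control: int) -> float:
    """Ratio of target-read fractions: treated sample over control."""
    return (target_treated / total_treated) / (target_control / total_control)
```

In practice, `predicted_non_cutters` would be called with a real rRNA gene reference sequence and a full enzyme catalogue. A prediction of this kind covers only the reference sequence, so sequence variants among repeats could still introduce cut sites.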