Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, SE-75123, Uppsala, Sweden.
Science for Life Laboratory, Uppsala University, SE-75185, Uppsala, Sweden.
Environ Microbiol. 2019 Jul;21(7):2485-2498. doi: 10.1111/1462-2920.14636. Epub 2019 May 7.
Amplicon sequencing of the 16S rRNA gene is the predominant method to quantify microbial compositions and to discover novel lineages. However, traditional short amplicons often do not contain enough information to confidently resolve their phylogeny. Here we present a cost-effective protocol that amplifies a large part of the rRNA operon and sequences the amplicons with PacBio technology. We tested our method on a mock community and developed a read-curation pipeline that reduces the overall read error rate to 0.18%. Applying our method to four environmental samples, we captured near full-length rRNA operon amplicons from a large diversity of prokaryotes. The method operated at moderately high throughput (22,286-37,850 raw CCS reads) and generated a large number of putative novel archaeal 23S rRNA gene sequences compared to the archaeal SILVA database. These long amplicons allowed for higher resolution during taxonomic classification by means of long (∼1000 bp) 16S rRNA gene fragments and for substantially more confident phylogenies by means of combined near full-length 16S and 23S rRNA gene sequences, compared to shorter traditional amplicons (250 bp of the 16S rRNA gene). We recommend our method to those who wish to cost-effectively and confidently estimate the phylogenetic diversity of prokaryotes in environmental samples at high throughput.