Soil bacterial diversity screening using single 16S rRNA gene V regions coupled with multi-million read generating sequencing technologies.
Affiliation
Università Cattolica del Sacro Cuore, Faculty of Agricultural Sciences, Institute of Agricultural and Environmental Chemistry, Piacenza, Italy.
Publication information
PLoS One. 2012;7(8):e42671. doi: 10.1371/journal.pone.0042671. Epub 2012 Aug 6.
Novel sequencing technologies that generate millions of reads are very promising for resolving the immense bacterial diversity of soil at the 16S rRNA gene level. However, their maximum read length is limited, which restricts studies to screening DNA stretches that span single 16S rRNA gene hypervariable (V) regions. The aim of the present study was to assess how the properties of four consecutive V regions (V3 to V6) affect the analytical methodologies commonly applied in bacterial ecology studies. Using an in silico approach, the performance of each V region was compared with that of the complete 16S rRNA gene stretch. We assessed related properties of the soil-derived bacterial sequence collection of the Ribosomal Database Project (RDP) database and concomitantly performed simulations based on published datasets. The results indicate that, overall, V3 was the best-performing V region for soil bacterial diversity studies, even though it was outperformed in some of the tests. Despite its high performance in most tests, V4 was less conserved along its flanking sites, which reduced its ability to cover bacterial diversity. V5 performed well in the analysis based on the non-redundant RDP database. However, V5 did not resemble the full-length 16S rRNA gene sequence results as well as V3 and V4 did when the natural sequence frequency and occurrence approximation was considered in the virtual experiment. Although the highly conserved flanking sequence regions of V6 make it possible to amplify partial 16S rRNA gene sequences from very diverse organisms, V6 was shown to be the least informative of the V regions examined. Our results indicate that environment-specific database exploration and theoretical assessment of the experimental approach are strongly recommended in 16S rRNA gene-based bacterial diversity studies.
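The abstract ties the coverage of each V region to how well its flanking sites are conserved. As a rough illustration only, and not the authors' actual pipeline, the following Python sketch shows one way such an in silico check could look: it searches each sequence for a forward and a reverse flanking site (allowing IUPAC degenerate bases), extracts the stretch between them, and reports the fraction of sequences in which both sites were found. The primer strings, the toy sequences, and the function names `extract_region` and `coverage` are all hypothetical and chosen purely for demonstration; exact matching is assumed, with no mismatches allowed.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a sequence, including degenerate bases."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex fragment."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_region(seq, fwd, rev):
    """Return the stretch between both flanking sites (primers excluded),
    or None if either site is missing. Exact matching only."""
    pattern = primer_regex(fwd) + "(.+?)" + primer_regex(reverse_complement(rev))
    match = re.search(pattern, seq.upper())
    return match.group(1) if match else None


def coverage(seqs, fwd, rev):
    """Fraction of sequences from which the region could be extracted."""
    found = sum(extract_region(s, fwd, rev) is not None for s in seqs)
    return found / len(seqs)


# Hypothetical toy primers and sequences, NOT real 16S rRNA data.
FWD = "ACGTY"
REV = "GGRTC"  # its reverse complement, GAYCC, is what appears on the forward strand
TOY_SEQS = [
    "TTACGTCAAATTTGGGATCCAA",  # both sites present
    "ACGTTCCCCGACCC",          # both sites present (degenerate positions differ)
    "ACGTACCCCGATCC",          # forward site mutated, so the region is not recovered
]

if __name__ == "__main__":
    for s in TOY_SEQS:
        print(s, "->", extract_region(s, FWD, REV))
    print("coverage:", coverage(TOY_SEQS, FWD, REV))
```

In this toy setting, a region whose flanking sites vary more across sequences yields a lower coverage value, which is the kind of effect the abstract reports for V4. A realistic assessment would also need to tolerate mismatches and to work on aligned reference sequences, neither of which this sketch attempts.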