Smith Paul E, Waters Sinead M, Gómez Expósito Ruth, Smidt Hauke, Carberry Ciara A, McCabe Matthew S
Teagasc Animal and Bioscience Research Department, Teagasc Grange, Meath, Ireland.
UCD School of Agricultural and Food Science, University College Dublin, Dublin, Ireland.
Front Microbiol. 2020 Dec 8;11:606825. doi: 10.3389/fmicb.2020.606825. eCollection 2020.
Our understanding of complex microbial communities, such as those residing in the rumen, has advanced considerably through the use of high-throughput sequencing (HTS) technologies. With barcoded amplicon sequencing, it is now cost-effective and computationally feasible to identify individual rumen microbial genera associated with ruminant livestock nutrition, genetics, performance and greenhouse gas production. However, across all disciplines of microbial ecology, the use of internal controls for validating HTS results is rarely reported. Furthermore, there is little consensus on the most appropriate reference database for analyzing rumen microbiota amplicon sequencing data. Therefore, in this study, a synthetic rumen-specific sequencing standard was used to assess the effects of database choice on results obtained from rumen microbial amplicon sequencing. Four DADA2 reference training sets (RDP, SILVA, GTDB, and RefSeq + RDP) were compared to assess their ability to correctly classify the sequences included in the rumen-specific sequencing standard. In addition, two phylogenetic bootstrap thresholds, 50 and 80, were applied to investigate the effect of increasing stringency. Sequence classification differences were apparent amongst the databases. For example, the classification of [taxon name missing in source] differed between all databases, highlighting the need for a consistent approach to nomenclature amongst different reference databases. It is hoped that the effect of database choice on taxonomic classification observed in this study will encourage research groups across various microbial disciplines to develop and routinely use their own microbiome-specific reference standard to validate analysis pipelines and database choice.
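The comparison described above (several reference databases, each scored against a standard of known composition at bootstrap thresholds of 50 and 80) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the taxon names, database labels, bootstrap values and function names are all hypothetical placeholders, and the rule that a call is kept when its bootstrap support is at or above the threshold is an assumption about how the cutoff is applied.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical synthetic standard: sequence ID -> genus it was designed to represent.
EXPECTED: Dict[str, str] = {"seq1": "GenusA", "seq2": "GenusB", "seq3": "GenusC"}

# Hypothetical classifier output per reference database:
# sequence ID -> (assigned genus, bootstrap support on a 0-100 scale).
ASSIGNMENTS: Dict[str, Dict[str, Tuple[str, int]]] = {
    "db_one": {"seq1": ("GenusA", 95), "seq2": ("GenusB", 62), "seq3": ("GenusX", 88)},
    "db_two": {"seq1": ("GenusA", 81), "seq2": ("GenusB", 45), "seq3": ("GenusC", 79)},
}


def apply_threshold(
    calls: Dict[str, Tuple[str, int]], min_boot: int
) -> Dict[str, Optional[str]]:
    """Keep a genus call only if its bootstrap support is >= min_boot (assumed rule)."""
    return {
        seq_id: (genus if boot >= min_boot else None)
        for seq_id, (genus, boot) in calls.items()
    }


def score(expected: Dict[str, str], calls: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Count correct, incorrect and unclassified calls against the known standard."""
    counts = {"correct": 0, "incorrect": 0, "unclassified": 0}
    for seq_id, truth in expected.items():
        call = calls.get(seq_id)
        if call is None:
            counts["unclassified"] += 1
        elif call == truth:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


def compare(
    expected: Dict[str, str],
    assignments: Dict[str, Dict[str, Tuple[str, int]]],
    thresholds: Sequence[int] = (50, 80),
) -> Dict[Tuple[str, int], Dict[str, int]]:
    """Score every (database, threshold) combination."""
    return {
        (db, t): score(expected, apply_threshold(calls, t))
        for db, calls in assignments.items()
        for t in thresholds
    }


if __name__ == "__main__":
    for (db, t), counts in sorted(compare(EXPECTED, ASSIGNMENTS).items()):
        print(db, t, counts)
```

In this toy data, raising the threshold from 50 to 80 turns weakly supported calls into unclassified ones but cannot rescue a confidently wrong call such as the hypothetical `GenusX`, which is the kind of database-specific discrepancy a known standard is meant to expose.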