Agroécologie, AgroSup Dijon, INRAE, Univ. Bourgogne, Univ. Bourgogne Franche-Comté, 21000, Dijon, France.
CEA/Institut de Biologie François Jacob/Génoscope, 2, Rue Gaston Crémieux, CP5706, 91057, Evry Cedex, France.
BMC Bioinformatics. 2020 Oct 31;21(1):492. doi: 10.1186/s12859-020-03829-3.
Comparing samples or studies easily with metabarcoding, so as to better interpret microbial ecology results, is an upcoming challenge. A growing number of metabarcoding pipelines are available, each with its own benefits and limitations. However, very few allow various microbial communities (e.g., archaea, bacteria, fungi, photosynthetic microeukaryotes) to be characterized with the same tool.
BIOCOM-PIPE is a flexible and independent suite of tools for processing data from high-throughput sequencing technologies (Roche 454 and Illumina platforms). It focuses on the diversity of archaeal, bacterial, fungal, and photosynthetic microeukaryote amplicons. Various original methods were implemented in BIOCOM-PIPE to (1) remove chimeras based on read abundance, (2) align sequences with structure-based alignments of RNA homologs using covariance models, and (3) post-cluster reads with a dedicated tool (ReClustOR) that improves OTU consistency based on a reference OTU database. The comparison with two other pipelines (FROGS and mothur) and with the Amplicon Sequence Variant definition highlighted that BIOCOM-PIPE was better at discriminating land use groups.
The BIOCOM-PIPE pipeline makes it possible to analyze 16S, 18S and 23S rRNA genes in the same packaged tool. The new post-clustering approach defines a biological database from previously analyzed samples and performs post-clustering of reads against this reference database by using open-reference clustering. This makes it easier to compare projects from various sequencing runs and increases the congruence among results. The pipeline was developed so that all users can easily add or modify the components, the databases and the bioinformatics tools, giving high modularity for each analysis.
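To make the idea of open-reference clustering concrete, the following is a minimal sketch and not the ReClustOR implementation. It assumes pre-aligned reads of equal length, a simple per-position identity score, and an arbitrary similarity threshold; the function names (`identity`, `open_reference_cluster`) and the OTU labels are hypothetical and chosen only for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads.

    Assumption: a real pipeline would use a proper alignment-based similarity.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def open_reference_cluster(reads, reference, threshold=0.95):
    """Assign each read to its closest OTU centroid, or open a new OTU.

    reads:     dict mapping read id -> sequence
    reference: dict mapping reference OTU id -> centroid sequence
    threshold: hypothetical minimum identity required to join an existing OTU
    """
    centroids = dict(reference)  # copy so the caller's reference stays unchanged
    assignment = {}
    next_id = 1
    for read_id, seq in reads.items():
        best_otu, best_score = None, 0.0
        for otu_id, centroid in centroids.items():
            score = identity(seq, centroid)
            if score > best_score:
                best_otu, best_score = otu_id, score
        if best_otu is None or best_score < threshold:
            # No existing OTU is similar enough: the read seeds a new OTU,
            # which later reads can then join.
            best_otu = f"new_OTU_{next_id}"
            next_id += 1
            centroids[best_otu] = seq
        assignment[read_id] = best_otu
    return assignment, centroids


if __name__ == "__main__":
    reference = {"ref_OTU_1": "ACGTACGTAC"}
    reads = {"r1": "ACGTACGTAC", "r2": "TTTTTTTTTT"}
    assignment, centroids = open_reference_cluster(reads, reference, threshold=0.85)
    print(assignment)
```

Because reads that match a reference OTU keep that OTU's identifier, two projects clustered against the same reference database share labels for the OTUs they have in common, which is what makes results from different sequencing runs easier to compare.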