MICCA: a complete and accurate software for taxonomic profiling of metagenomic data.

Authors

Albanese Davide, Fontana Paolo, De Filippo Carlotta, Cavalieri Duccio, Donati Claudio

Affiliations

Fondazione Edmund Mach, Research and Innovation Centre, Computational Biology Department, Via E. Mach 1, 38010 - S. Michele all'Adige (TN), Italy.

Fondazione Edmund Mach, Research and Innovation Centre, Food Quality Nutrition & Health Department, Via E. Mach 1, 38010 - S. Michele all'Adige (TN), Italy.

Publication information

Sci Rep. 2015 May 19;5:9743. doi: 10.1038/srep09743.

Abstract

The introduction of high-throughput sequencing technologies has triggered an increase in the number of studies in which the microbiota of environmental and human samples is characterized through the sequencing of selected marker genes. While experimental protocols have undergone a process of standardization that makes them accessible to a large community of scientists, standard and robust data analysis pipelines are still lacking. Here we introduce MICCA, a software pipeline for the processing of amplicon metagenomic datasets that efficiently combines quality filtering, clustering of Operational Taxonomic Units (OTUs), taxonomy assignment and phylogenetic tree inference. MICCA provides accurate results while reaching a good compromise between modularity and usability. Moreover, we introduce a de novo clustering algorithm specifically designed for the inference of OTUs. Tests on real and synthetic datasets show that, thanks to the optimized read filtering process and to the new clustering algorithm, MICCA provides estimates of the number of OTUs and of other common ecological indices that are more accurate and robust than those of currently available pipelines. Analysis of public metagenomic datasets shows that the higher consistency of results improves our understanding of the structure of environmental and human-associated microbial communities. MICCA is an open source project.
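As an illustration of the "number of OTUs and other common ecological indices" mentioned in the abstract, the following minimal sketch computes observed OTU richness and the Shannon index from a vector of per-OTU read counts. It is not MICCA code: the function names `richness` and `shannon` and the example counts are assumptions made here for illustration, and the abstract does not say which indices the authors evaluated.

```python
import math


def richness(counts):
    """Observed richness: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # an empty sample has no diversity to measure
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical sample: read counts for four OTUs, two of them unobserved.
sample = [10, 10, 0, 0]
print(richness(sample), shannon(sample))
```

Spurious OTUs produced by sequencing errors inflate both quantities, which is one reason read filtering and clustering choices matter for these estimates.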

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ec1/4649890/c5e0faf99589/srep09743-f1.jpg
