Sharifi Fatemeh, Ye Yuzhen
School of Informatics and Computing, Indiana University, 150 S. Woodlawn Ave., Bloomington, IN, 47405, USA.
Methods Mol Biol. 2017;1611:27-34. doi: 10.1007/978-1-4939-7015-5_3.
Microbes play important roles in almost every aspect of life, including human health and disease. Facilitated by the rapid development of sequencing technologies, metagenomics research has accelerated the accumulation of genomic sequences from microbial species that were previously inaccessible. Analysis of metagenomic sequencing data can reveal not only the species composition but also the functional composition of microbial communities. Here, we report Fun4Me, a pipeline for functional annotation of metagenomic datasets. The pipeline is built from several programs that we have developed for metagenomic sequence analysis, including a protein-coding gene predictor for short reads (or contigs) and a fast similarity search tool. Given a metagenomic dataset, the pipeline reports putative protein-coding genes (or gene fragments), functional annotations of those genes as Gene Ontology (GO) terms and Enzyme Commission (EC) numbers, and potential metabolic pathways that are likely encoded by the metagenome. Fun4Me is available for download at https://sourceforge.net/projects/fun4me.
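To make the outputs listed above concrete, the following is a minimal Python sketch of how per-gene GO and EC annotations might be summarized downstream. It does not use Fun4Me or reproduce its file formats: the gene IDs, the dictionary layout, the `toy_pathway` definition, and the 0.5 coverage threshold are all assumptions made for illustration. The abstract does not state how pathways are inferred, so a naive coverage fraction stands in for that step here.

```python
from collections import Counter

# Hypothetical per-gene annotations. The gene IDs and this layout are
# illustrative assumptions, not Fun4Me's actual output format; the GO and
# EC identifiers are sample values only.
annotations = {
    "read_001_gene_1": {"go": ["GO:0004340"], "ec": ["2.7.1.2"]},
    "read_002_gene_1": {"go": ["GO:0004347"], "ec": ["5.3.1.9"]},
    "read_003_gene_1": {"go": ["GO:0004340"], "ec": ["2.7.1.2"]},
    "read_004_gene_1": {"go": [], "ec": []},  # predicted gene with no hit
}

# Hypothetical pathway definition: pathway name -> set of EC numbers.
pathways = {
    "toy_pathway": {"2.7.1.2", "5.3.1.9", "2.7.1.11"},
}


def count_terms(annotations, key):
    """Count how many predicted genes carry each term of the given kind."""
    counts = Counter()
    for ann in annotations.values():
        counts.update(ann[key])
    return counts


def pathway_coverage(observed_ecs, pathways):
    """Fraction of each pathway's EC numbers observed in the dataset."""
    return {
        name: len(ecs & observed_ecs) / len(ecs)
        for name, ecs in pathways.items()
    }


def likely_pathways(coverage, threshold=0.5):
    """Pathways whose coverage meets an (assumed) threshold."""
    return sorted(name for name, frac in coverage.items() if frac >= threshold)


go_counts = count_terms(annotations, "go")
ec_counts = count_terms(annotations, "ec")
coverage = pathway_coverage(set(ec_counts), pathways)

print("GO term counts:", dict(go_counts))
print("EC number counts:", dict(ec_counts))
print("Pathway coverage:", coverage)
print("Likely pathways:", likely_pathways(coverage))
```

A coverage threshold is the simplest possible criterion and tends to over-report pathways that share enzymes with others; it is shown only to illustrate the step from EC numbers to candidate pathways.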