[A review on the bioinformatics pipelines for metagenomic research].

Authors

Ye Dan-Dan, Fan Meng-Meng, Guan Qiong, Chen Hong-Ju, Ma Zhan-Shan

Affiliation

Kunming Institute of Zoology, Chinese Academy of Sciences, China.

Publication

Dongwuxue Yanjiu. 2012 Dec;33(6):574-85. doi: 10.3724/SP.J.1141.2012.06574.

Abstract

Metagenome, a term coined by Handelsman in 1998 for "the genomes of the total microbiota found in nature", refers to sequence data sampled directly from the environment, which may be any habitat in which microbes live, such as the guts of humans and animals, milk, soil, lakes, glaciers, and oceans. Metagenomic technologies originated in environmental microbiology, and their wide application has been greatly facilitated by next-generation high-throughput sequencing. As in genomics, the bottleneck of metagenomic research is how to analyze the enormous volume of sequence data effectively and efficiently with bioinformatics pipelines to obtain meaningful biological insights. In this article, we briefly review the state-of-the-art bioinformatics software tools for metagenomic research. Because data obtained by whole-genome sequencing (i.e., shotgun metagenomics) differ from data obtained by amplicon sequencing (i.e., 16S rRNA and gene-targeted metagenomics), the corresponding bioinformatics tools differ significantly; accordingly, we review the computational pipelines for these two types of data separately.

