PANGEA: pipeline for analysis of next generation amplicons.

Affiliation

Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611-0700, USA.

Publication information

ISME J. 2010 Jul;4(7):852-61. doi: 10.1038/ismej.2010.16. Epub 2010 Feb 25.

Abstract

High-throughput DNA sequencing can identify organisms and describe population structures in many environmental and clinical samples. Current technologies generate millions of reads in a single run, so organizing, analyzing and interpreting those sequences requires extensive computational strategies. A series of bioinformatics tools for high-throughput sequencing analysis, covering pre-processing, clustering, database matching and classification, has been compiled into a pipeline called PANGEA. The PANGEA pipeline is written in Perl and runs on Mac OS X, Windows or Linux. With PANGEA, sequences obtained directly from the sequencer can be processed quickly to produce the files needed for sequence identification by BLAST and for comparison of microbial communities. Two sets of bacterial 16S rRNA sequences were used to show the efficiency of this workflow. The first set is derived from various soils from Hawaii Volcanoes National Park. The second set is derived from stool samples collected from diabetes-resistant and diabetes-prone rats. The workflow described here allows the investigator to quickly assess libraries of sequences on personal computers with customized databases. PANGEA is provided either as individual scripts, one for each step in the process, or as a single program called the 'backbone', which joins all processes except the chi-square (χ²) step.
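The abstract does not say how the chi-square step is computed, so the following is only a minimal sketch of one common way to compare a taxon's abundance between two sequence libraries: a Pearson chi-square test on a 2x2 table (reads assigned to the taxon versus all other reads, in each library). It is written in Python rather than Perl, and the function name, parameter names and read counts are hypothetical illustrations, not part of PANGEA.

```python
def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for one taxon observed in two sequence libraries.

    count_a, count_b: reads assigned to the taxon in library A and B
    total_a, total_b: total classified reads in library A and B
    """
    # Build the 2x2 contingency table: taxon vs. everything else.
    a, b = count_a, total_a - count_a
    c, d = count_b, total_b - count_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        # A row or column is empty, so the test is undefined; report 0.
        return 0.0
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts: 30 of 100 reads in library A, 10 of 100 in library B.
statistic = chi_square_2x2(30, 100, 10, 100)

# 3.841 is the chi-square critical value for p = 0.05 at 1 degree of freedom.
is_different = statistic > 3.841
print(statistic, is_different)
```

When many taxa are tested across the same pair of libraries, a multiple-testing correction would normally be applied to the resulting p-values; that is omitted here for brevity.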

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b8a/2974434/6decf367c360/nihms238158f1.jpg
