
CAAT-Box, Contigs-Assembly and Annotation Tool-Box for genome sequencing projects.

Authors

Frangeul L, Glaser P, Rusniok C, Buchrieser C, Duchaud E, Dehoux P, Kunst F

Affiliation

Génopole, Institut Pasteur, 28 rue du Dr Roux, 75724 Paris 15, France.

Publication

Bioinformatics. 2004 Mar 22;20(5):790-7. doi: 10.1093/bioinformatics/btg490. Epub 2004 Jan 29.

Abstract

MOTIVATION

Contigs-Assembly and Annotation Tool-Box (CAAT-Box) is a software package developed for the computational part of a genome project in which the sequence is obtained by a shotgun strategy. CAAT-Box contains new tools that predict links between contigs by using similarity searches against other whole-genome sequences. Most importantly, it allows annotation of a genome to begin during the finishing phase, using a gene-oriented strategy. For this purpose, CAAT-Box creates an Individual Protein File (IPF) for each ORF of an assembly. The nucleotide sequence reported in an IPF corresponds to the sequence of the ORF with 500 additional bases before the ORF and 200 bases after it. For annotation, additional information such as BLAST results can be added or linked to the IPFs, as can automatic and/or manual annotations. When a new assembly is performed, CAAT-Box creates new IPFs according to the old IPF panel. CAAT-Box recognizes the modified IPFs, which are the only ones used for a new automatic analysis after each assembly. Using this strategy, the user works with a group of IPFs independently of the progress of the closure phase. The IPFs are accessible through a web server and can therefore be modified and commented on by different groups.
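
The IPF sequence window and the detection of modified IPFs described above can be illustrated with a minimal Python sketch. It is not CAAT-Box code: the function names `ipf_window` and `modified_ipfs`, the 0-based half-open forward-strand coordinates, the clipping of flanks at contig ends, and the comparison of IPFs by exact sequence identity are all assumptions made for illustration. Only the 500-base and 200-base flank sizes come from the abstract.

```python
from typing import Dict, Set

UPSTREAM = 500    # bases kept before the ORF (from the abstract)
DOWNSTREAM = 200  # bases kept after the ORF (from the abstract)


def ipf_window(contig: str, orf_start: int, orf_end: int) -> str:
    """Return the ORF plus its flanking bases.

    Assumptions (not stated in the abstract): coordinates are 0-based,
    half-open, on the forward strand, and flanks are clipped when the
    ORF lies close to a contig end.
    """
    if not 0 <= orf_start < orf_end <= len(contig):
        raise ValueError("ORF coordinates fall outside the contig")
    left = max(0, orf_start - UPSTREAM)
    right = min(len(contig), orf_end + DOWNSTREAM)
    return contig[left:right]


def modified_ipfs(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Return IDs of IPFs that are new or whose sequence changed.

    Assumption: an IPF counts as modified when its window sequence is
    not identical to the one stored for the previous assembly.
    """
    return {ipf_id for ipf_id, seq in new.items() if old.get(ipf_id) != seq}
```

Under these assumptions, only the identifiers returned by `modified_ipfs` would need to be passed to a new automatic analysis after a reassembly, while annotations attached to unchanged IPFs stay untouched.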

RESULT

CAAT-Box was used to obtain and annotate several complete genomes, such as those of Listeria monocytogenes and Streptococcus agalactiae.

AVAILABILITY

The program may be obtained from the authors and is freely available to non-profit organisations.

