National Laboratory for Scientific Computing (LNCC), Getúlio Vargas, Petrópolis, Rio de Janeiro, Brazil.
BMC Microbiol. 2012 Aug 9;12:172. doi: 10.1186/1471-2180-12-172.
The type IV secretion system (T4SS) can be classified as a large family of macromolecule transporter systems, divided into three recognized sub-families according to their known functions. The major sub-family is the conjugation system, which allows transfer of genetic material, such as a nucleoprotein, via cell contact among bacteria. The conjugation system can also transfer genetic material from bacteria to eukaryotic cells, as in the T-DNA transfer from Agrobacterium tumefaciens to host plant cells. The effector protein transport system constitutes the second sub-family, and the third corresponds to the DNA uptake/release system. Genome analyses have revealed numerous T4SSs in Bacteria and Archaea. The purpose of this work was to organize, classify, and integrate the T4SS data into a single database, called AtlasT4SS - the first public database devoted exclusively to this prokaryotic secretion system.
AtlasT4SS is a manually curated database that describes a large number of proteins related to the type IV secretion system reported so far in Gram-negative and Gram-positive bacteria, as well as in Archaea. The database was built with the MySQL RDBMS and the Catalyst framework, which is based on the Perl programming language and follows the Model-View-Controller (MVC) design pattern for the Web. The current version holds a comprehensive collection of 1,617 T4SS proteins from 58 Bacteria (49 Gram-negative and 9 Gram-positive), one archaeon, and 11 plasmids. By applying the bi-directional best hit (BBH) relationship in pairwise genome comparisons, it was possible to obtain a core set of 134 clusters of orthologous genes encoding T4SS proteins.
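The BBH relationship can be illustrated with a minimal Python sketch. It assumes that pairwise similarity scores (for example, BLAST bit scores) between the proteins of two genomes are already available. The function names, protein identifiers, and scores are hypothetical and are not taken from the AtlasT4SS pipeline.

```python
def best_hits(scores):
    """Map each query protein to its highest-scoring subject protein.

    `scores` maps (query, subject) pairs to a similarity score where
    higher is better. On a tie, the first pair encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # genome A proteins searched against genome B
    reverse = best_hits(b_vs_a)   # genome B proteins searched against genome A
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical toy scores for two small genomes (illustration only).
a_vs_b = {
    ("a1", "b1"): 250.0,
    ("a1", "b2"): 90.0,
    ("a2", "b2"): 180.0,
    ("a3", "b2"): 120.0,
}
b_vs_a = {
    ("b1", "a1"): 245.0,
    ("b2", "a2"): 175.0,
    ("b2", "a3"): 110.0,
    ("b3", "a3"): 60.0,
}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, a3 has b2 as its best hit, but b2 prefers a2, so a3 is not part of any BBH pair. Grouping BBH pairs across many genome comparisons into ortholog clusters is a separate step that this sketch does not cover.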
In our database we present one way of classifying orthologous groups of T4SSs in a hierarchical scheme with three levels. The first level comprises four classes based on the organization of genetic determinants, shared homologies, and evolutionary relationships: (i) F-T4SS, (ii) P-T4SS, (iii) I-T4SS, and (iv) GI-T4SS. The second level designates either a specific well-known protein family or an uncharacterized protein family. Finally, at the third level, each protein of an ortholog cluster is classified according to its involvement in a specific cellular process. The AtlasT4SS database is open access and is available at http://www.t4ss.lncc.br.
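One way to picture the three-level scheme is as a small record type. The Python sketch below is only an illustration: the class name, field names, and example values are assumptions and do not reflect the actual AtlasT4SS schema. Only the four first-level class names come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Level 1: the four classes named in the abstract.
T4SS_CLASSES = ("F-T4SS", "P-T4SS", "I-T4SS", "GI-T4SS")


@dataclass(frozen=True)
class OrthologAnnotation:
    """Hypothetical three-level annotation for one protein of an ortholog cluster."""

    t4ss_class: str                 # level 1: one of T4SS_CLASSES
    protein_family: Optional[str]   # level 2: None means uncharacterized family
    cellular_process: str           # level 3: process the protein takes part in

    def __post_init__(self):
        # Reject first-level classes that are not part of the scheme.
        if self.t4ss_class not in T4SS_CLASSES:
            raise ValueError(f"unknown T4SS class: {self.t4ss_class!r}")

    def path(self) -> str:
        """Render the annotation as a 'level 1 / level 2 / level 3' string."""
        family = self.protein_family or "uncharacterized"
        return f"{self.t4ss_class} / {family} / {self.cellular_process}"


# Placeholder values, not real database entries.
known = OrthologAnnotation("P-T4SS", "ExampleFamily", "example process")
unknown = OrthologAnnotation("GI-T4SS", None, "example process")
print(known.path())
print(unknown.path())
```

Keeping the second level optional mirrors the distinction between well-known and uncharacterized protein families without inventing a family name for the latter.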