Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark.
Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark; Comparative Genomics Group, Oak Ridge National Laboratory, Biosciences Division, Oak Ridge, TN, USA.
Front Microbiol. 2014 Mar 18;5:73. doi: 10.3389/fmicb.2014.00073. eCollection 2014.
We have compared chromosome-specific genes in a set of 18 finished Vibrio genomes, and also calculated the pan- and core-genomes from a data set of more than 250 draft Vibrio genome sequences. These genomes come from 9 known species and 2 unknown species. Within the finished chromosomes, we find a core set of 1269 encoded protein families for chromosome 1, and a core of 252 encoded protein families for chromosome 2. Many of these core proteins are also found in the draft genomes (although which chromosome they are located on is unknown). Of the chromosome-specific core protein families, 1169 and 153 are uniquely found in chromosomes 1 and 2, respectively. Gene ontology (GO) terms for each of the protein families were determined, and the different sets for each chromosome were compared. A total of 363 different "Molecular Function" GO categories were found for chromosome 1-specific protein families, and these include several broad activities: pyridoxine 5' phosphate synthetase, glucosylceramidase, heme transport, DNA ligase, amino acid binding, and ribosomal components. In contrast, chromosome 2-specific protein families have only 66 "Molecular Function" GO terms and include many membrane-associated activities, such as ion channels, transmembrane transporters, and electron transport chain proteins. Thus, it appears that whilst there are many "housekeeping systems" encoded in chromosome 1, there are far fewer core functions found in chromosome 2. However, the presence of many membrane-associated encoded proteins in chromosome 2 is surprising.
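The pan-genome, core-genome, and chromosome-specific sets described above can be illustrated with plain set operations. The following is a minimal sketch on toy data, not the authors' pipeline: the abstract does not state how protein families were clustered or exactly how "uniquely found" was defined, so the function names (`pan_and_core`, `chromosome_specific`), the family identifiers, and the rule "core on one chromosome and never observed on the other" are all assumptions made for illustration.

```python
def pan_and_core(genomes):
    """Return (pan, core) for a dict mapping genome name -> set of family IDs.

    pan  = families present in at least one genome (union)
    core = families present in every genome (intersection)
    """
    family_sets = list(genomes.values())
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


def chromosome_specific(core_this, seen_on_other):
    """Core families of one chromosome never observed on the other.

    Assumed definition for illustration; the abstract does not spell it out.
    """
    return core_this - seen_on_other


# Hypothetical toy data: protein families per chromosome for three genomes.
chr1 = {
    "genomeA": {"f1", "f2", "f3", "f4"},
    "genomeB": {"f1", "f2", "f3", "f5"},
    "genomeC": {"f1", "f2", "f3"},
}
chr2 = {
    "genomeA": {"f3", "f6", "f7"},
    "genomeB": {"f3", "f6", "f8"},
    "genomeC": {"f3", "f6"},
}

pan1, core1 = pan_and_core(chr1)
pan2, core2 = pan_and_core(chr2)

specific1 = chromosome_specific(core1, pan2)
specific2 = chromosome_specific(core2, pan1)

print("chromosome 1 core:", sorted(core1), "specific:", sorted(specific1))
print("chromosome 2 core:", sorted(core2), "specific:", sorted(specific2))
```

In this toy example, family `f3` is core to both chromosomes and therefore counts as specific to neither, which mirrors why the chromosome-specific counts in the abstract (1169 and 153) are smaller than the corresponding core counts (1269 and 252).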