Kiraga Joanna, Mackiewicz Pawel, Mackiewicz Dorota, Kowalczuk Maria, Biecek Przemysław, Polak Natalia, Smolarczyk Kamila, Dudek Miroslaw R, Cebrat Stanislaw
Department of Genomics, University of Wrocław, Wrocław, Poland.
BMC Genomics. 2007 Jun 12;8:163. doi: 10.1186/1471-2164-8-163.
The distribution of protein isoelectric points (pI) within a proteome is universal across organisms: it is bimodal, dividing the proteome into a set of acidic proteins and a set of basic proteins. Species differ, however, in the relative abundance of acidic and basic proteins, and this difference may be correlated with taxonomy, subcellular localization, ecological niche and proteome size.
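As an illustration of the quantity discussed above, the pI of a single protein can be estimated as the pH at which its net charge is zero. The following is a minimal sketch, not the authors' method: the pKa values, the bisection tolerance, the function names and the acidic/basic threshold of 7.0 are all assumptions introduced here, since the abstract does not state how pI was computed.

```python
# Side-chain and terminal pKa values are illustrative assumptions; the abstract
# does not say which pKa set the authors used.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(sequence, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    # Terminal groups contribute one positive and one negative site.
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for residue in sequence.upper():
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH where the net charge crosses zero, by bisection."""
    low, high = 0.0, 14.0
    # Net charge decreases monotonically with pH, so bisection converges.
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0


def classify(sequence, threshold=7.0):
    """Label a protein acidic or basic; the threshold is an assumption."""
    return "acidic" if isoelectric_point(sequence) < threshold else "basic"
```

Applying `classify` to every protein in a proteome and counting the two labels gives a rough measure of how acidic or basic that proteome is under these assumptions.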
We analysed 1784 proteomes encoded by chromosomes of Archaea, Bacteria and Eukaryota, and by mitochondria, plastids, prokaryotic plasmids, phages and viruses. In more than 95% of proteomes we found a significant correlation between protein length and pI: positive for acidic proteins and negative for basic ones. Plastids, viruses and plasmids encode more basic proteomes, whereas chromosomes of Archaea, Bacteria and Eukaryota, as well as mitochondria and phages, encode more acidic ones. Mitochondrial proteomes of Viridiplantae, Protista and Fungi are more basic than those of Metazoa. This results from the presence of basic proteins in the former proteomes and their absence from the latter, and is related to the reduction of metazoan genomes. A significant correlation was found between the pI bias of proteomes encoded by prokaryotic chromosomes and that of proteomes encoded by plasmids, but there is no correlation between eukaryotic nuclear-coded proteomes and proteomes encoded by organelles. Detailed analyses of prokaryotic proteomes showed significant relationships between pI distribution and habitat, relation to the host cell and salinity of the environment, but no significant correlation with oxygen and temperature requirements. Salinity is positively correlated with the acidity of proteomes. Host-associated organisms, and especially intracellular species, have more basic proteomes than free-living ones. The higher rate of mutation accumulation in intracellular parasites and endosymbionts is responsible for the basicity of their tiny proteomes, which explains the observed positive correlation between decreasing genome size and increasing proteome basicity. The results indicate that even conserved proteins subject to strong selective constraints follow the global trend in the pI distribution.
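The length-pI relationship reported above can be checked per proteome by splitting proteins into acidic and basic subsets and correlating length with pI within each. The sketch below assumes a Pearson coefficient and a pI threshold of 7.0; the abstract specifies neither, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None if it is undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return covariance / sqrt(var_x * var_y)


def length_pi_correlations(proteins, threshold=7.0):
    """Correlate length with pI separately for acidic and basic proteins.

    `proteins` is an iterable of (length, pI) pairs; the threshold that
    separates acidic from basic proteins is an assumption.
    """
    acidic = [(length, pi) for length, pi in proteins if pi < threshold]
    basic = [(length, pi) for length, pi in proteins if pi >= threshold]
    result = {}
    for label, subset in (("acidic", acidic), ("basic", basic)):
        lengths = [length for length, _ in subset]
        pis = [pi for _, pi in subset]
        result[label] = pearson(lengths, pis)
    return result
```

With this layout, the pattern described in the abstract would appear as a positive coefficient for the acidic subset and a negative one for the basic subset, that is, longer proteins tending toward neutral pI in both groups.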
The pI distribution of proteins in proteomes shows clear relationships with protein length, subcellular localization, and the taxonomy and ecology of organisms. The distribution is also strongly affected by mutational pressure, especially in intracellular organisms.