Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland.
PLoS One. 2013;8(1):e54471. doi: 10.1371/journal.pone.0054471. Epub 2013 Jan 21.
The minimal set of proteins necessary to maintain a vertebrate cell forms an interesting core of cellular machinery. The known proteome of the human red blood cell consists of about 1400 proteins. We treated this protein complement of one of the simplest human cells as a model and asked questions about its function and origins. The proteome was mapped onto phylogenetic profiles, i.e. vectors of species possessing homologues of human proteins. A novel clustering approach was devised, utilising similarity in the phylogenetic spread of homologues as the distance measure. The clustering based on phylogenetic profiles yielded several distinct protein classes differing in taxonomic spread, presumed evolutionary history and functional properties. Notably, small clusters of proteins common to vertebrates, or to Metazoa and other multicellular eukaryotes, involve biological functions specific to multicellular organisms, such as apoptosis and cell-cell signalling, respectively. A eukaryote-specific cluster was also identified, featuring GTPase signalling and ubiquitination. Another cluster, made up of proteins found in most organisms, including bacteria and archaea, involves basic molecular functions such as oxidation-reduction and glycolysis. Approximately one third of erythrocyte proteins do not fall into any of the clusters, reflecting the complexity of protein evolution in comparison to our simple model. In broad terms, the clustering divides the proteome into old and new parts, the former originating from bacterial ancestors and the latter from inventions within multicellular eukaryotes. Thus, the model human cell proteome appears to be made up of protein sets distinct in their history and biological roles. The current work shows that the phylogenetic profile concept allows protein clustering in a way relevant to both biological function and evolutionary history.
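The idea of a phylogenetic profile and of clustering by similarity in phylogenetic spread can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the species panel, the protein names and the presence/absence values are toy data, and the Jaccard distance with single-linkage grouping is a common generic choice standing in for the clustering approach of the study, whose details the abstract does not give.

```python
from itertools import combinations

# Hypothetical species panel; each profile position corresponds to one species.
SPECIES = ["E. coli", "M. jannaschii", "S. cerevisiae",
           "D. melanogaster", "D. rerio", "M. musculus"]

# Toy phylogenetic profiles (1 = homologue present, 0 = absent).
# Protein names and values are invented for illustration, not taken from the paper.
PROFILES = {
    "glycolytic_enzyme": (1, 1, 1, 1, 1, 1),
    "redox_enzyme":      (1, 1, 1, 1, 1, 1),
    "small_gtpase":      (0, 0, 1, 1, 1, 1),
    "ubiquitin_ligase":  (0, 0, 1, 1, 1, 1),
    "apoptosis_factor":  (0, 0, 0, 0, 1, 1),
}


def jaccard_distance(a, b):
    """Return 1 - |shared species| / |species where either protein has a homologue|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def cluster_profiles(profiles, max_distance):
    """Single-linkage grouping: link any two proteins within max_distance."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(profiles, 2):
        if jaccard_distance(profiles[p], profiles[q]) <= max_distance:
            parent[find(p)] = find(q)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    for members in cluster_profiles(PROFILES, max_distance=0.2):
        print(members)
```

With this toy input, proteins present in all species, proteins restricted to eukaryotes and proteins restricted to vertebrates end up in separate groups, mirroring the old-versus-new division described above. The threshold of 0.2 is arbitrary and would need tuning on real data.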