Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska.
Holland Computing Center, Office of Research, University of Nebraska-Lincoln, Lincoln, Nebraska.
Proteins. 2019 Jun;87(6):492-501. doi: 10.1002/prot.25670. Epub 2019 Feb 19.
The functional evolution of proteins advances through gene duplication followed by functional drift, whereas molecular evolution occurs through random mutational events. Over time, protein active-site structures or functional epitopes remain highly conserved, which enables relationships to be inferred between distant orthologs or paralogs. In this study, we present the first functional clustering and evolutionary analysis of the RCSB Protein Data Bank (RCSB PDB) based on similarities between active-site structures. All of the ligand-bound proteins within the RCSB PDB were scored using our Comparison of Protein Active-site Structures (CPASS) software and database (http://cpass.unl.edu/). Principal component analysis was then used to identify 4431 representative structures, from which a phylogenetic tree was constructed based on the CPASS comparative scores (http://itol.embl.de/shared/jcatazaro). The resulting phylogenetic tree identified a sequential, step-wise evolution of protein active sites and provides novel insights into the emergence of protein function or changes in substrate specificity based on subtle changes in geometry and amino acid composition.
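The final step described above, turning pairwise comparative scores into a tree, can be illustrated with a minimal sketch. The abstract does not state which tree-building algorithm was used, so average-linkage (UPGMA) clustering is swapped in here purely as an assumption. The site names, the score values, the 0 to 100 score scale, and the helper names `similarity_to_distance` and `upgma` are all hypothetical; this is not the CPASS implementation, and the principal component analysis step is omitted.

```python
from itertools import combinations


def similarity_to_distance(sim, max_score=100.0):
    """Convert a symmetric similarity matrix into a distance matrix.

    Assumes (hypothetically) that scores lie between 0 and max_score.
    """
    n = len(sim)
    return [
        [0.0 if i == j else max_score - sim[i][j] for j in range(n)]
        for i in range(n)
    ]


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns a Newick topology string."""
    # Active clusters: id -> (Newick label, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = min(d, key=d.get)
        label_a, size_a = clusters.pop(a)
        label_b, size_b = clusters.pop(b)
        del d[(a, b)]

        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)

        clusters[next_id] = (f"({label_a},{label_b})", size_a + size_b)
        next_id += 1

    (label, _size), = clusters.values()
    return label + ";"


# Hypothetical pairwise active-site similarity scores for four sites.
names = ["siteA", "siteB", "siteC", "siteD"]
sim = [
    [100, 90, 40, 35],
    [90, 100, 45, 30],
    [40, 45, 100, 85],
    [35, 30, 85, 100],
]

tree = upgma(names, similarity_to_distance(sim))
print(tree)  # ((siteA,siteB),(siteC,siteD));
```

The output is a Newick string without branch lengths, a format that tree viewers commonly accept. In this toy input, the two high-scoring pairs are grouped first and then joined at the root.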