Hong Yoojin, Chalkia Dimitra, Ko Kyung Dae, Bhardwaj Gaurav, Chang Gue Su, van Rossum Damian B, Patterson Randen L
Center for Computational Proteomics, The Pennsylvania State University.
J Proteomics Bioinform. 2009 Mar 21;2:139-149. doi: 10.4172/jpb.1000071.
One of the major challenges in the genomic era is assigning structural and functional annotation to the vast quantities of sequence information now available. Indeed, most of the protein sequence database lacks comprehensive annotation, even when experimental evidence exists. Further, even within structurally resolved and functionally annotated protein domains, additional functions contained in these domains are not apparent. To complicate matters further, small changes in the amino-acid sequence can lead to profound changes in both structure and function, underscoring the need for rapid and reliable methods to analyze these types of data. Phylogenetic profiles provide a quantitative method that can relate the structural and functional properties of proteins, as well as their evolutionary relationships. Using all of the structurally resolved Src-Homology-2 (SH2) domains, we demonstrate that knowledge-bases can be used to create single-amino-acid phylogenetic profiles that reliably annotate lipid binding. Indeed, these measures isolate the known phosphotyrosine and hydrophobic pockets as integral to lipid-binding function. In addition, we determined that the SH2 domains of Tec family kinases bind to lipids with varying affinity and specificity. Simulated mutations in Bruton's tyrosine kinase (BTK) that cause X-Linked Agammaglobulinemia (XLA) are predicted to alter lipid binding, which we confirm experimentally. In light of these results, we propose that XLA-causing mutations in the SH3-SH2 domain of BTK alter lipid binding, which could play a causative role in the XLA phenotype. Overall, our study suggests that the number of lipid-binding proteins is drastically underestimated and that, with further development, phylogenetic profiles can provide a method for rapidly increasing the functional annotation of protein sequences.