Weskamp Nils, Hüllermeier Eyke, Kuhn Daniel, Klebe Gerhard
Department of Mathematics and Computer Science and The Institute of Pharmaceutical Chemistry, University of Marburg, Hans-Meerwein-Strasse, Marburg, Germany.
IEEE/ACM Trans Comput Biol Bioinform. 2007 Apr-Jun;4(2):310-20. doi: 10.1109/tcbb.2007.358301.
Graphs are frequently used to describe the geometry and also the physicochemical composition of protein active sites. Here, the concept of graph alignment as a novel method for the structural analysis of protein binding pockets is presented. Using inexact graph-matching techniques, one is able to identify both conserved areas and regions of difference among different binding pockets. Thus, using multiple graph alignments, it is possible to characterize functional protein families and to examine differences among related protein families independent of sequence or fold homology. Optimized algorithms are described for the efficient calculation of multiple graph alignments for the analysis of physicochemical descriptors representing protein binding pockets. Additionally, it is shown how the calculated graph alignments can be analyzed to identify structural features that are characteristic of a given protein family and also features that discriminate among related families. The methods are applied to a substantial high-quality subset of the Protein Data Bank (PDB), and their ability to successfully characterize and classify 10 highly populated functional protein families is shown. Additionally, two related protein families from the group of serine proteases are examined, and important structural differences are detected automatically and efficiently.
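To make the idea of inexact graph matching between binding pockets concrete, the following is a minimal Python sketch of a pairwise alignment between two very small labeled graphs. It is an illustration only, not the optimized algorithm described in the paper: the scoring scheme (+1 or -1 per label and per edge-length comparison), the tolerance `tol`, the exhaustive search, the function names, and the example pocket data are all assumptions made for this sketch. Nodes stand for physicochemical descriptors, edges carry distances, and a node may be left unmatched (`None`).

```python
from itertools import product


def alignment_score(labels_a, dist_a, labels_b, dist_b, mapping, tol=0.5):
    """Score one node mapping from graph A to graph B (hypothetical scheme).

    mapping[i] is the index of the B node matched to A node i, or None.
    Matched nodes score +1 for equal labels, -1 otherwise. Each pair of
    matched nodes scores +1 if both graphs have an edge between them and
    the edge lengths agree within tol, -1 if both edges exist but disagree.
    """
    pairs = [(i, j) for i, j in enumerate(mapping) if j is not None]
    score = 0.0
    for i, j in pairs:
        score += 1.0 if labels_a[i] == labels_b[j] else -1.0
    for x in range(len(pairs)):
        for y in range(x + 1, len(pairs)):
            (i1, j1), (i2, j2) = pairs[x], pairs[y]
            da = dist_a.get(frozenset((i1, i2)))
            db = dist_b.get(frozenset((j1, j2)))
            if da is not None and db is not None:
                score += 1.0 if abs(da - db) <= tol else -1.0
    return score


def best_alignment(labels_a, dist_a, labels_b, dist_b, tol=0.5):
    """Exhaustively search all injective mappings (tiny graphs only)."""
    options = list(range(len(labels_b))) + [None]
    best_map, best_score = None, float("-inf")
    for mapping in product(options, repeat=len(labels_a)):
        used = [j for j in mapping if j is not None]
        if len(used) != len(set(used)):
            continue  # two A nodes mapped to the same B node
        s = alignment_score(labels_a, dist_a, labels_b, dist_b, mapping, tol)
        if s > best_score:
            best_map, best_score = mapping, s
    return best_map, best_score


# Hypothetical toy pockets: labels are descriptor types, distances in angstroms.
labels_a = ["donor", "acceptor", "aromatic"]
dist_a = {frozenset((0, 1)): 3.0, frozenset((1, 2)): 4.5, frozenset((0, 2)): 5.0}

labels_b = ["aromatic", "donor", "acceptor", "donor"]
dist_b = {
    frozenset((1, 2)): 3.2, frozenset((0, 2)): 4.4, frozenset((0, 1)): 5.3,
    frozenset((2, 3)): 7.0, frozenset((0, 3)): 6.0, frozenset((1, 3)): 2.0,
}

mapping, score = best_alignment(labels_a, dist_a, labels_b, dist_b)
print(mapping, score)
```

The exhaustive search grows exponentially with graph size, so it is only usable for toy inputs; realistic pocket graphs, and the multiple alignments discussed in the abstract, need heuristic or otherwise optimized procedures.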