Kinjo Akira R, Nakamura Haruki
Institute for Protein Research, Osaka University, Osaka, Japan.
Methods Mol Biol. 2013;932:295-315. doi: 10.1007/978-1-62703-065-6_18.
Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyzing protein functions is to compare and classify the structures of protein interaction interfaces. Here, we describe procedures for compiling a database of interface structures and for comparing those structures efficiently. Doing so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the parts of the PDB exchange dictionary needed to extract data relevant to analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) that encodes a sequence of two or three secondary structure elements, including their relative orientations, as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has a particularly high propensity to be found in interaction interfaces in general, indicating that any SSS can be used as a binding interface. When individual structural motifs are examined, some SSS strings have a high propensity for particular groups of structural motifs. In addition, we show that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit a somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description for investigating the universality of ligand binding modes.
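As an illustration of what a propensity of an SSS string for interfaces might look like in practice, the following minimal Python sketch computes a log-odds ratio between the frequency of each string among interface occurrences and its frequency in a background set. This is an assumption made for illustration: the abstract does not state the chapter's exact definition of propensity, the function name `sss_propensity` is hypothetical, and the example strings are made up rather than taken from the actual SSS alphabet.

```python
from collections import Counter
from math import log2


def sss_propensity(interface_strings, background_strings):
    """Log2 odds ratio of each SSS string: interface frequency vs. background.

    Assumed definition (not confirmed by the source):
        log2( (count_interface / total_interface)
              / (count_background / total_background) )
    Positive values mean the string is enriched at interfaces.
    """
    iface = Counter(interface_strings)
    bg = Counter(background_strings)
    n_iface = sum(iface.values())
    n_bg = sum(bg.values())

    propensity = {}
    for sss, count in iface.items():
        if bg[sss] == 0:
            # Skip strings never seen in the background (ratio undefined).
            continue
        propensity[sss] = log2((count / n_iface) / (bg[sss] / n_bg))
    return propensity


# Hypothetical four-letter SSS strings, invented for this example only.
background = ["HaHa", "HaHa", "EpEp", "EpEp", "HaEp", "EpHa", "HaHa", "EpEp"]
interface = ["HaHa", "EpEp", "HaEp", "HaEp"]

prop = sss_propensity(interface, background)
for sss, value in sorted(prop.items()):
    print(f"{sss}: {value:+.3f}")
```

Under this assumed definition, values near zero for most strings would correspond to the finding that no SSS string is strongly preferred at interfaces in general, whereas clearly positive values within one motif group would mark motif-specific strings.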