Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan.
BMC Genomics. 2010 Dec 1;11 Suppl 3(Suppl 3):S7. doi: 10.1186/1471-2164-11-S3-S7.
Comprehensive exploration of protein-protein interactions is a challenging route to understanding biological processes. To efficiently enlarge the set of protein interactions annotated with residue-based binding models, we propose a new concept, "3D-domain interolog mapping", with a scoring system to explore all possible protein pairs between two homolog families, derived from a known 3D-structure dimer (template), across multiple species. Each family consists of homologous proteins that have the interacting domains of the template, which allows the domain interface evolution of the two interacting homolog families to be studied.
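The pairing step of this mapping can be illustrated with a minimal sketch. It assumes that each homolog family is already available as a list of (protein ID, species) tuples and that candidate pairs are formed only between proteins of the same species; the function name, the data layout and the same-species restriction are illustrative assumptions, and the scoring system is not modeled here.

```python
from collections import defaultdict


def candidate_pairs(family_a, family_b):
    """Enumerate candidate interacting pairs between two homolog families.

    family_a, family_b: lists of (protein_id, species) tuples (hypothetical
    layout). Returns a dict mapping species -> list of (id_a, id_b) pairs.
    Assumption: only proteins from the same species are paired.
    """
    # Index the second family by species for quick lookup.
    b_by_species = defaultdict(list)
    for protein_id, species in family_b:
        b_by_species[species].append(protein_id)

    pairs = defaultdict(list)
    for id_a, species in family_a:
        # .get avoids creating empty entries for species missing in family_b.
        for id_b in b_by_species.get(species, []):
            pairs[species].append((id_a, id_b))
    return dict(pairs)


# Toy families (made-up identifiers, not taken from the database).
family_a = [("A1", "human"), ("A2", "mouse"), ("A3", "yeast")]
family_b = [("B1", "human"), ("B2", "human"), ("B3", "mouse")]
result = candidate_pairs(family_a, family_b)
print(result)
```

In this toy input the yeast protein has no partner in the second family, so no yeast pair is produced.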
The 3D-interologs database records the evolution of protein-protein interactions across multiple species. Based on "3D-domain interolog mapping" and a new scoring function, we inferred 173,294 protein-protein interactions by using 1,895 three-dimensional (3D) structure heterodimers to search the UniProt database (4,826,134 protein sequences). The 3D-interologs database comprises 15,124 species and 283,980 protein-protein interactions, including the 173,294 inferred interactions (61%) and 110,686 interactions (39%) summarized from the IntAct database. For each protein-protein interaction, the 3D-interologs database shows functional annotations (e.g. Gene Ontology), interacting domains and binding models (e.g. hydrogen-bond interactions and conserved residues). Additionally, the database provides couple-conserved residues and the evolution of the interaction by exploring the interologs across multiple species. Experimental results reveal that the proposed scoring function agrees well with the binding affinities of 275 mutated residues from the ASEdb. The precision and recall of our method are 0.52 and 0.34, respectively, when 563 non-redundant heterodimers are used to search the Integr8 database (549 complete genomes).
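For readers unfamiliar with the two evaluation measures, the following minimal sketch shows how precision and recall are conventionally computed for a set of predicted interactions against a reference set, and checks the reported 61%/39% composition of the database. The function name and the toy protein pairs are hypothetical; the paper's actual evaluation protocol may differ in how reference sets are defined.

```python
def precision_recall(predicted, known):
    """Standard precision and recall for unordered protein pairs.

    predicted, known: iterables of (protein_a, protein_b) tuples.
    Pairs are treated as unordered, so (a, b) equals (b, a).
    """
    predicted = {frozenset(pair) for pair in predicted}
    known = {frozenset(pair) for pair in known}
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall


# Toy example with made-up identifiers (not data from the paper).
predicted = [("P1", "P2"), ("P3", "P4"), ("P5", "P6"), ("P7", "P8")]
known = [("P2", "P1"), ("P3", "P4"), ("P9", "P10")]
precision, recall = precision_recall(predicted, known)

# Composition of the database, using the counts stated in the abstract.
inferred, from_intact, total = 173_294, 110_686, 283_980
inferred_pct = round(100 * inferred / total)
intact_pct = round(100 * from_intact / total)
print(precision, recall, inferred_pct, intact_pct)
```

Two of the four toy predictions appear in the reference set, so the toy precision is 0.5, while two of the three reference pairs are recovered.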
Experimental results demonstrate that the proposed method can infer reliable physical protein-protein interactions and is useful for studying the evolution of protein-protein interactions across multiple species. In addition, the top-ranked strategy and the template interface score significantly improve the accuracy of identifying protein-protein interactions in a complete genome. The 3D-interologs database is available at http://3D-interologs.life.nctu.edu.tw.