Jefferson Emily R, Walsh Thomas P, Barton Geoffrey J
University of Dundee, School of Life Sciences, Dow Street, Dundee, DD1 5EH Scotland, United Kingdom.
Proteins. 2008 Jan 1;70(1):54-62. doi: 10.1002/prot.21496.
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of the choice of domain classification scheme on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, in a nonredundant set 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that, if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces, by an estimated 6.5%, the false negative rate of predictive methods that use homology matching to structural data to predict protein-protein interactions.
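The "no equivalent" percentages above are, in essence, set comparisons. A minimal sketch of that kind of calculation follows, assuming each interface has already been reduced to an unordered pair of hypothetical region keys that are comparable across both schemes. The mapping step, the key names, and the toy data are illustrative assumptions and are not taken from the article, whose actual equivalence criteria are not described in this abstract.

```python
def fraction_without_equivalent(interfaces, other_interfaces):
    """Return the fraction of `interfaces` that do not occur in `other_interfaces`."""
    if not interfaces:
        return 0.0
    unmatched = interfaces - other_interfaces
    return len(unmatched) / len(interfaces)


# Hypothetical toy data: each interface is an unordered pair of region keys.
# Real SCOP and CATH domains would first need mapping to a shared key space.
scop_interfaces = {
    frozenset(pair) for pair in [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
}
cath_interfaces = {
    frozenset(pair) for pair in [("A", "B"), ("A", "C"), ("E", "F")]
}

# Share of each scheme's interfaces that the other scheme does not capture.
scop_without_cath = fraction_without_equivalent(scop_interfaces, cath_interfaces)
cath_without_scop = fraction_without_equivalent(cath_interfaces, scop_interfaces)

# Using both schemes together corresponds to the union of the two sets.
combined_interfaces = scop_interfaces | cath_interfaces

print(f"SCOP interfaces with no CATH equivalent: {scop_without_cath:.1%}")
print(f"CATH interfaces with no SCOP equivalent: {cath_without_scop:.1%}")
print(f"Interfaces when both schemes are combined: {len(combined_interfaces)}")
```

Using `frozenset` makes each pair order-independent, so an interface between regions A and B matches one recorded as B and A. The toy numbers bear no relation to the 23.6% and 37.3% reported above.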