
Inter-species validation for domain combination based protein-protein interaction prediction method.

Author information

Jang Woo-Hyuk, Han Dong-Soo, Kim Hong-Soog, Lee Sung-Doke

Affiliation

School of Engineering, Information and Communications University, 119, Munjiro, Daejeon, 305-714, Korea.

Publication information

Genome Inform. 2005;16(2):136-47.

Abstract

The Domain Combination based Protein-Protein Interaction Prediction (DCPPIP) method has been shown to achieve outstanding prediction accuracy for Yeast proteins. However, it is not yet clear whether the method remains valid and achieves comparable accuracy for proteins of other species. In this paper, we report the results of validating the DCPPIP method on Fly and Human proteins. We also report the results of inter-species validation, in which the protein interaction and domain data of other species are used as the learning set. The validation uses 10,351 interacting protein pairs for Fly and 2,345 protein pairs for Human; 80% of the data are used as learning sets and 20% are reserved as test sets. High prediction accuracy is achieved in both cases (Fly: sensitivity approximately 77%, specificity approximately 92%; Human: sensitivity approximately 96%, specificity approximately 95%). Interactions of proteins in Human, Mouse, H. pylori, E. coli, and C. elegans are then predicted and validated using the protein interaction and domain data of Yeast, of Fly, and of Yeast and Fly combined. Again, good prediction accuracy is achieved when the test protein pair shares domains with the proteins in the learning set. This paper also introduces the notion of Domain Overlapping Rate (DOR) among species and examines the correlation between DOR and prediction accuracy. According to our test results, there is a fairly clear correlation between DOR and prediction accuracy.
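For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are conventionally computed from a test set, along with one plausible reading of a domain overlap measure. The abstract does not give the formula for DOR, so `domain_overlap_rate` is an assumption (the fraction of the test species' domains that also occur in the learning species), not the authors' definition. The counts and the domain identifiers are illustrative only and are not taken from the paper.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly interacting pairs that are predicted to interact."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-interacting pairs that are predicted not to interact."""
    return true_neg / (true_neg + false_pos)


def domain_overlap_rate(test_domains, learning_domains) -> float:
    """ASSUMED definition, not from the paper: the share of domains in the
    test species that also appear in the learning species."""
    test_set = set(test_domains)
    if not test_set:
        return 0.0
    return len(test_set & set(learning_domains)) / len(test_set)


# Illustrative counts only (not data from the paper).
fly_sensitivity = sensitivity(true_pos=77, false_neg=23)
fly_specificity = specificity(true_neg=92, false_pos=8)

# Hypothetical domain identifiers for a test and a learning species.
overlap = domain_overlap_rate(
    {"D1", "D2", "D3", "D4"},
    {"D1", "D2", "D9"},
)
```

Under this assumed definition, a higher overlap means that more test proteins contain domains already seen in the learning set, which is consistent with the abstract's observation that accuracy is good when test pairs share domains with the learning set.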

