Mahdavi Mahmood A, Lin Yen-Han
Department of Chemical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
Genomics Proteomics Bioinformatics. 2007 Dec;5(3-4):177-86. doi: 10.1016/S1672-0229(08)60005-4.
Protein domains are conserved, functionally independent structures that play an important role in interactions among related proteins. Domain-domain interactions have recently been used to predict protein-protein interactions (PPI). In general, the interaction probability of a pair of domains is scored using a trained scoring function; when the score satisfies a threshold, the protein pairs carrying those domains are regarded as "interacting". In this study, the signature contents of proteins were used to predict PPI pairs in Saccharomyces cerevisiae, Caenorhabditis elegans, and Homo sapiens. Similarity between protein signature patterns was scored, and PPI predictions were drawn from a binary similarity scoring function. When evaluated against a test set, the true positive rate of the proposed approach is approximately 32% higher than that of the maximum likelihood estimation method, resulting in a 22% increase in the area under the receiver operating characteristic (ROC) curve. When proteins containing only one or two signatures were removed, the sensitivity of the predicted PPI pairs increased significantly. At a confidence level of 0.95, the predicted PPI pairs are on average 11 times more likely to interact than randomly selected pairs, and the predictions are on average 4 times better than those obtained by either phylogenetic profiling or gene expression profiling.
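The signature-similarity idea described above can be illustrated with a minimal sketch. The abstract does not say which binary similarity function was used, so the Jaccard coefficient below is an assumed stand-in, and the names `jaccard`, `predict_pairs`, `threshold`, and `min_signatures`, as well as the toy protein and signature identifiers, are hypothetical and not taken from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Binary similarity of two signature sets (assumed stand-in measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_pairs(signatures, threshold=0.5, min_signatures=1):
    """Return (protein, protein, score) for pairs scoring at or above threshold.

    signatures: dict mapping a protein ID to its set of signature IDs.
    min_signatures: proteins with fewer signatures are dropped first,
    mirroring the removal of proteins with only one or two signatures.
    """
    kept = {p: s for p, s in signatures.items() if len(s) >= min_signatures}
    predictions = []
    for p, q in combinations(sorted(kept), 2):
        score = jaccard(kept[p], kept[q])
        if score >= threshold:
            predictions.append((p, q, score))
    return predictions


# Toy data, purely illustrative (not from the study).
proteins = {
    "P1": {"sigA", "sigB", "sigC"},
    "P2": {"sigA", "sigB", "sigC", "sigD"},
    "P3": {"sigA"},
    "P4": {"sigE", "sigF", "sigG"},
}

print(predict_pairs(proteins, threshold=0.5))
print(predict_pairs(proteins, threshold=0.3, min_signatures=3))
```

Setting `min_signatures=3` drops proteins that carry only one or two signatures before scoring. In this toy data that removes the weak P1/P3 match, which rests on a single shared signature.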