Chen Jin, Hsu Wynne, Lee Mong Li, Ng See-Kiong
School of Computing, National University of Singapore, Singapore 119260.
Bioinformatics. 2006 Aug 15;22(16):1998-2004. doi: 10.1093/bioinformatics/btl335. Epub 2006 Jun 20.
Experimental limitations in high-throughput protein-protein interaction detection methods have resulted in low-quality interaction datasets that contain sizable fractions of false positives and false negatives. Small-scale, focused experiments are therefore needed to complement the high-throughput methods and extract true protein interactions. However, the naturally vast interactomes require much more scalable approaches.
We describe a novel method called IRAP* as a computational complement for repurification of the highly erroneous, experimentally derived protein interactomes. Our method involves an iterative process of removing interactions that are confidently identified as false positives and adding interactions detected as false negatives to the interactomes. Identification of both false positives and false negatives is performed in IRAP* using interaction confidence measures based on network topological metrics. Potential false positives are identified among the detected interactions as those with very low computed confidence values, while potential false negatives are discovered as the undetected interactions with high computed confidence values. Our results from applying IRAP* to large-scale interaction datasets generated by the popular yeast two-hybrid assays for yeast, fruit fly and worm showed that the computationally repurified interaction datasets contained potentially lower fractions of false positive and false negative errors, as judged by functional homogeneity.
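The iterative remove-and-add loop described above can be illustrated with a minimal sketch. It is not the IRAP* implementation: the confidence measure here is a simple stand-in (Jaccard overlap of the two proteins' neighbourhoods), and the function names `jaccard_confidence` and `repurify`, the thresholds `low` and `high`, and the round limit `max_rounds` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def build_adjacency(edges):
    """Map each protein to the set of its current interaction partners."""
    adj = {}
    for edge in edges:
        u, v = sorted(edge)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def jaccard_confidence(adj, u, v):
    """Stand-in topological confidence (an assumption, not the IRAP* measure):
    the fraction of neighbours that u and v share, ignoring each other."""
    nu = adj.get(u, set()) - {v}
    nv = adj.get(v, set()) - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def repurify(edges, low=0.1, high=0.6, max_rounds=10):
    """Iteratively drop low-confidence detected interactions (candidate false
    positives) and add high-confidence undetected ones (candidate false
    negatives). Thresholds are hypothetical."""
    edges = {frozenset(e) for e in edges}
    proteins = sorted({p for e in edges for p in e})
    for _ in range(max_rounds):
        adj = build_adjacency(edges)
        removed = {e for e in edges
                   if jaccard_confidence(adj, *sorted(e)) < low}
        added = {frozenset((u, v)) for u, v in combinations(proteins, 2)
                 if frozenset((u, v)) not in edges
                 and jaccard_confidence(adj, u, v) >= high}
        if not removed and not added:
            break  # the network is stable under these thresholds
        edges = (edges - removed) | added
    return edges


# Toy network: A-X has no shared neighbours, A-D is missing but well supported.
toy = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("A", "X")]
cleaned = repurify(toy)
```

On this toy network the sketch removes A-X and adds A-D in the first round, then stops because nothing further changes. Scoring every undetected pair is quadratic in the number of proteins, so a real interactome would need the candidate set restricted, for example to pairs that share at least one neighbour.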
The confidence indices for PPIs in yeast, fruit fly and worm as computed by our method can be found at our website http://www.comp.nus.edu.sg/~chenjin/fpfn.