CellNetworks Cluster of Excellence, University of Heidelberg, Germany.
Methods. 2012 Dec;58(4):343-8. doi: 10.1016/j.ymeth.2012.07.028. Epub 2012 Aug 4.
Negative protein-protein interaction datasets are needed for training and evaluation of interaction prediction methods, as well as validation of high-throughput interaction discovery experiments. In large-scale two-hybrid assays, the direct interaction of a large number of protein pairs is systematically probed. We present a simple method to harness two-hybrid data to obtain negative protein-protein interaction datasets, which we validated using other available experimental data. The method identifies interactions that were likely tested but not observed in a two-hybrid screen. For each negative interaction, a confidence score is defined as the shortest-path length between the two proteins in the interaction network derived from the two-hybrid experiment. We show that these high-quality negative datasets are particularly important when a specific biological context is considered, such as in the study of protein interaction specificity. We also illustrate the use of a negative dataset in the evaluation of the InterPreTS interaction prediction method.
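The scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. In particular, the rule used here for "likely tested" (any bait that yielded at least one interaction, paired with any prey that was recovered at least once) is an assumption, and the function names and toy protein labels are hypothetical. Each candidate negative pair is scored by the shortest-path length between its two proteins in the undirected network built from the observed two-hybrid interactions.

```python
from collections import deque


def build_graph(interactions):
    """Build an undirected adjacency map from (bait, prey) pairs."""
    graph = {}
    for a, b in interactions:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_lengths(graph, source):
    """Breadth-first search: path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def negative_pairs(observed):
    """Score candidate negatives from observed (bait, prey) interactions.

    Assumption: a bait/prey pair counts as "likely tested" when the bait
    produced at least one interaction and the prey was found at least once.
    Returns {(bait, prey): shortest-path length, or None if disconnected}.
    """
    observed = list(observed)
    graph = build_graph(observed)
    baits = {bait for bait, _ in observed}
    preys = {prey for _, prey in observed}
    positives = {frozenset(pair) for pair in observed}

    scores = {}
    for bait in sorted(baits):
        dist = shortest_path_lengths(graph, bait)
        for prey in sorted(preys):
            # Skip self-pairs and pairs observed in either orientation.
            if bait == prey or frozenset((bait, prey)) in positives:
                continue
            scores[(bait, prey)] = dist.get(prey)
    return scores


# Toy screen with hypothetical protein labels.
observed = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
scores = negative_pairs(observed)
```

In this toy network the pair `("A", "D")` receives a score of 3, `("A", "C")` a score of 2, and pairs that fall in different connected components, such as `("A", "F")`, are returned as `None` so that the caller can decide how to treat them.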