Jonsson Pall F, Cavanna Tamara, Zicha Daniel, Bates Paul A
Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, 44 Lincoln's Inn Fields, London WC2A 3PX, UK.
BMC Bioinformatics. 2006 Jan 6;7:2. doi: 10.1186/1471-2105-7-2.
Protein-protein interactions have traditionally been studied on a small scale, using classical biochemical methods to investigate the proteins of interest. More recently, large-scale methods, such as two-hybrid screens, have been utilised to survey extensive portions of genomes. Current high-throughput approaches have a relatively high rate of errors, whereas in-depth biochemical studies are too expensive and time-consuming to be practical for extensive studies. As a result, there are gaps in our knowledge of many key biological networks, and computational approaches are particularly suitable for filling them.
We constructed networks, or 'interactomes', of putative protein-protein interactions in the rat proteome, the rat being an organism extensively used for cancer studies. This was achieved by integrating experimental protein-protein interaction data from many species and translating these data into the reference frame of the rat. The putative rat protein interactions were given confidence scores based on their homology to proteins that have been experimentally observed to interact. The confidence score was furthermore weighted according to the extent of the experimental evidence, giving a higher weight to more frequently observed interactions. The scoring function was subsequently validated, and networks were constructed around key proteins identified as being highly up- or down-regulated in rat cell lines of high metastatic potential. Using clustering methods on the networks, we have identified key protein communities involved in cancer metastasis.
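The transfer-and-score step described above can be illustrated with a minimal sketch. This is not the scoring function used in the study, which the abstract does not specify. It assumes, purely for illustration, that each source protein maps to rat homologues with a similarity value between 0 and 1, that a transferred interaction scores as the product of the two similarities, and that repeated experimental observations simply add up. The function name `transfer_interactions` and both input structures are hypothetical.

```python
from collections import defaultdict
from itertools import product


def transfer_interactions(observed, homologues):
    """Map observed interactions onto rat proteins and score them.

    observed:   iterable of (protein_a, protein_b) pairs, one entry per
                experimental observation (repeats are allowed and count).
    homologues: dict mapping a source protein to a list of
                (rat_protein, similarity) tuples, similarity in [0, 1].

    Assumption (illustrative only): score = sum over observations of
    similarity_a * similarity_b, so frequently observed interactions
    between close homologues receive the highest scores.
    """
    scores = defaultdict(float)
    for a, b in observed:
        rat_a = homologues.get(a, [])
        rat_b = homologues.get(b, [])
        for (ra, sim_a), (rb, sim_b) in product(rat_a, rat_b):
            # Store each undirected pair under one canonical key.
            pair = tuple(sorted((ra, rb)))
            scores[pair] += sim_a * sim_b
    return dict(scores)
```

Under these assumptions, an interaction seen twice between distant homologues can score the same as one seen once between closer homologues, which is the kind of trade-off a validated scoring function would need to calibrate.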
The protein network generation and subsequent network analysis used here were shown to be useful for highlighting key proteins involved in metastasis. This approach, in conjunction with microarray expression data, can be extended to other species, thereby suggesting possible pathways around proteins of interest.