Srihari Sriganesh, Leong Hon Wai
Department of Computer Science, National University of Singapore, Singapore 117417, Singapore.
J Bioinform Comput Biol. 2013 Apr;11(2):1230002. doi: 10.1142/S021972001230002X. Epub 2012 Nov 7.
Complexes of physically interacting proteins are among the fundamental functional units responsible for driving key biological mechanisms within the cell. Their identification is therefore necessary to understand not only complex formation but also the higher-level organization of the cell. With the advent of "high-throughput" techniques in molecular biology, a significant amount of physical interaction data has been cataloged from organisms such as yeast, which has in turn fueled computational approaches to systematically mine complexes from the network of physical interactions among proteins (PPI network). In this survey, we review, classify and evaluate some of the key computational methods developed to date for the identification of protein complexes from PPI networks. We present two taxonomies that reflect how these methods have evolved over the years toward improving automated complex prediction. We also discuss some open challenges facing accurate reconstruction of complexes, the crucial ones being the high proportion of errors and noise in current high-throughput datasets and some key aspects overlooked by current complex detection methods. We hope this review will not only condense the history of computational complex detection for easy reference but also provide valuable insights to drive further research in this area.