Bleakley Kevin, Biau Gérard, Vert Jean-Philippe
Institut de Mathématiques et de Modélisation de Montpellier, UMR CNRS 5149, Equipe de Probabilités et Statistique, Université Montpellier II, CC 051, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France.
Bioinformatics. 2007 Jul 1;23(13):i57-65. doi: 10.1093/bioinformatics/btm204.
Inferring and reconstructing biological networks from heterogeneous data is an active research area with several important applications in systems biology. The problem has been approached from many different angles, with varying degrees of success. In particular, predicting new edges at a reasonable false discovery rate is in high demand for practical applications, but it remains extremely challenging because the networks of interest are sparse.
Most previous approaches that rely on partial knowledge of the network to be inferred build global models to predict new edges over the whole network. Here we introduce a novel method that uses local models to predict whether there is an edge from a newly added vertex to each vertex of a known network. This involves learning a separate subnetwork associated with each vertex of the known network, then using the classification rule learned for that vertex alone to predict its edge to the new vertex. Excellent experimental results are shown for the reconstruction of metabolic and protein-protein interaction networks from a variety of genomic data.
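The local-model idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier used for each local model, so a simple nearest-centroid rule is swapped in here purely for illustration. The function names (`train_local_models`, `predict_edges`), the feature vectors, and the toy network are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

Vector = Sequence[float]
# A local model is a pair: (centroid of neighbours, centroid of non-neighbours).
LocalModel = Tuple[List[float], List[float]]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_local_models(
    features: Dict[str, Vector], edges: Set[FrozenSet[str]]
) -> Dict[str, LocalModel]:
    """Fit one classifier per known vertex v.

    For vertex v, every other known vertex u is a training example,
    labelled positive if the edge {u, v} is known and negative otherwise.
    Nearest-centroid is an assumption made for this sketch only.
    """
    models: Dict[str, LocalModel] = {}
    for v in features:
        others = [u for u in features if u != v]
        pos = [features[u] for u in others if frozenset((u, v)) in edges]
        neg = [features[u] for u in others if frozenset((u, v)) not in edges]
        if pos and neg:  # a local model needs both classes
            models[v] = (centroid(pos), centroid(neg))
    return models


def predict_edges(
    models: Dict[str, LocalModel], x_new: Vector
) -> Dict[str, bool]:
    """Predict, for each known vertex, whether it links to the new vertex.

    Only the model local to vertex v is used to decide the edge (v, new).
    """
    return {
        v: sq_dist(x_new, pos_c) < sq_dist(x_new, neg_c)
        for v, (pos_c, neg_c) in models.items()
    }


# Toy known network (hypothetical): two linked pairs in a 2-D feature space.
features = {"a": (0.0, 0.0), "b": (0.0, 1.0), "c": (5.0, 5.0), "d": (5.0, 6.0)}
edges = {frozenset(("a", "b")), frozenset(("c", "d"))}

models = train_local_models(features, edges)
prediction = predict_edges(models, (0.2, 0.5))  # features of the new vertex
```

Because each vertex has its own classifier, a decision about one candidate edge never depends on the models of other vertices. The cost of this independence is that a vertex with no known neighbours (or no known non-neighbours) gets no local model in this sketch.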
An implementation of the proposed algorithm is available upon request from the authors.