Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America.
PLoS One. 2013 May 31;8(5):e64832. doi: 10.1371/journal.pone.0064832. Print 2013.
Regulatory network reconstruction is a fundamental problem in computational biology. Reconstruction from any individual dataset has significant limitations, and researchers increasingly attempt to construct networks from multiple independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model that uses multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks, using yeast as a model. The resulting networks were not only more accurate than those produced using individual datasets and other existing methods, but also captured information about specific biological mechanisms and pathways that other methodologies missed. PANDA is scalable to higher eukaryotes, applicable to tissue-specific or cell-type-specific data, and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.
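To make the idea of passing information between networks more concrete, the following is a minimal sketch of one update step, not the authors' implementation. It assumes three hypothetical inputs: `W`, a TF-by-gene regulatory matrix seeded from motif data; `P`, a TF-by-TF matrix from protein-protein interactions; and `C`, a gene-by-gene co-expression matrix. The Tanimoto-style similarity, the averaging of the two messages, the function names, and the update rate `alpha` are all assumptions chosen for illustration.

```python
import math


def tanimoto(x, y):
    """Tanimoto-style similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = math.sqrt(sum(a * a for a in x) + sum(b * b for b in y) - abs(dot))
    return dot / denom if denom > 0 else 0.0


def column(matrix, j):
    """Return column j of a list-of-lists matrix."""
    return [row[j] for row in matrix]


def update_regulatory(W, P, C, alpha=0.1):
    """One illustrative smoothing step on the TF-by-gene network W.

    W: TF x gene edge weights (e.g. from motif data)
    P: TF x TF weights (e.g. from protein-protein interactions)
    C: gene x gene weights (e.g. from co-expression)
    alpha: hypothetical update rate in [0, 1]
    """
    n_tf, n_gene = len(W), len(W[0])
    new_W = []
    for i in range(n_tf):
        row = []
        for j in range(n_gene):
            # Message from the TF side: do TF i's interaction partners target gene j?
            from_tfs = tanimoto(P[i], column(W, j))
            # Message from the gene side: is gene j co-expressed with TF i's targets?
            from_genes = tanimoto(W[i], column(C, j))
            evidence = 0.5 * (from_tfs + from_genes)
            row.append((1 - alpha) * W[i][j] + alpha * evidence)
        new_W.append(row)
    return new_W


# Toy example: two TFs that interact with each other, two uncorrelated genes.
W = [[1.0, 0.0], [0.0, 1.0]]
P = [[1.0, 1.0], [1.0, 1.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
W_next = update_regulatory(W, P, C, alpha=0.5)
```

In this toy case the edge from the first TF to the second gene starts at zero and becomes positive after one step, because the first TF interacts with a TF that already targets that gene. A full method would also update `P` and `C` and iterate until the networks stop changing; those steps are omitted here.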