Hwang Taeyoung, Park Taesung
Interdisciplinary Program of Bioinformatics, Seoul National University, Seoul, Republic of Korea.
BMC Bioinformatics. 2009 Apr 30;10:128. doi: 10.1186/1471-2105-10-128.
Since high-throughput protein-protein interaction (PPI) data have recently become available for humans, interest in combining PPI data with other genome-wide data has grown. In particular, identifying phenotype-related PPI subnetworks from gene expression data has attracted considerable attention. Integrating the two data types to identify significant subnetworks requires a search algorithm paired with an appropriate scoring method. Here we propose a multivariate analysis of variance (MANOVA)-based scoring method, combined with a greedy search, for identifying differentially expressed PPI subnetworks.
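To illustrate what a MANOVA-based subnetwork score might look like, the following is a minimal sketch, not the authors' implementation. It assumes the score is derived from Wilks' lambda, one of the standard MANOVA test statistics, computed over the expression values of the genes in a candidate subnetwork. The function names `wilks_lambda` and `manova_score`, the `-log` transform, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def wilks_lambda(expr, labels):
    """Wilks' lambda = det(W) / det(T) for one candidate subnetwork.

    expr   : array of shape (n_samples, n_genes), expression of the
             subnetwork's genes (assumed layout).
    labels : array of shape (n_samples,), phenotype label per sample.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)

    # Total scatter matrix T: deviations from the overall mean.
    centered = expr - expr.mean(axis=0)
    total = centered.T @ centered

    # Within-group scatter matrix W: deviations from each group mean.
    within = np.zeros_like(total)
    for group in np.unique(labels):
        block = expr[labels == group]
        dev = block - block.mean(axis=0)
        within += dev.T @ dev

    # Log-determinants are more stable than raw determinants.
    # Assumes more samples than genes, so that both matrices are non-singular.
    _, logdet_within = np.linalg.slogdet(within)
    _, logdet_total = np.linalg.slogdet(total)
    return float(np.exp(logdet_within - logdet_total))


def manova_score(expr, labels):
    """Hypothetical score: larger means stronger differential expression."""
    return -np.log(wilks_lambda(expr, labels))


# Simulated example: 20 samples in two phenotype groups, 3 genes.
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
noise = rng.normal(size=(20, 3))   # no group difference
shifted = noise.copy()
shifted[labels == 1] += 3.0        # strong shift in the second group
```

Because the statistic uses the full covariance structure of the member genes rather than per-gene summaries, a score of this kind responds to correlation among the genes as well as to mean differences between phenotypes.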
Using the MANOVA-based scoring method, we performed a greedy search to identify the subnetworks with the maximum scores in the PPI network. We successfully applied this approach to human microarray datasets. Each identified subnetwork was annotated with Gene Ontology (GO) terms, yielding phenotype-related functional pathways or complexes. We also compared these results with those of other scoring methods, namely t statistic-based and mutual information-based scoring. The MANOVA-based method produced subnetworks containing more proteins than the other methods. Furthermore, the subnetworks identified by the MANOVA-based method tended to consist of highly correlated proteins.
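The greedy search can be sketched as follows. This is an illustrative outline under stated assumptions rather than the procedure used in the article: it starts from a single seed protein, repeatedly adds the neighbouring protein that raises the score the most, and stops when no neighbour improves the score. The function `greedy_subnetwork`, its parameters `max_size` and `min_gain`, and the toy network with additive node weights are hypothetical; in practice the `score` callable would be a MANOVA-based statistic computed on the expression data of the member proteins.

```python
def greedy_subnetwork(adjacency, seed, score, max_size=20, min_gain=0.0):
    """Grow a connected subnetwork from `seed` by greedy score maximisation.

    adjacency : dict mapping each protein to an iterable of its PPI partners.
    score     : callable taking a frozenset of proteins, returning a float.
    max_size  : hypothetical cap on the number of proteins in the subnetwork.
    min_gain  : hypothetical minimum improvement required to keep growing.
    """
    members = {seed}
    best = score(frozenset(members))

    while len(members) < max_size:
        # Candidates: proteins adjacent to the current subnetwork.
        frontier = {
            partner
            for protein in members
            for partner in adjacency.get(protein, ())
        } - members
        if not frontier:
            break

        # Score every one-protein extension and keep the best one.
        # Sorting makes tie-breaking deterministic.
        scored = [
            (score(frozenset(members | {candidate})), candidate)
            for candidate in sorted(frontier, key=str)
        ]
        candidate_score, candidate = max(scored, key=lambda pair: pair[0])

        # Stop when the best extension does not improve the score enough.
        if candidate_score - best <= min_gain:
            break
        members.add(candidate)
        best = candidate_score

    return members, best


# Toy network (hypothetical): edges A-B, B-C, C-D, A-E.
toy_adjacency = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["A"],
}
# Stand-in score: sum of per-protein weights, used here only to keep
# the example self-contained.
toy_weights = {"A": 1.0, "B": 2.0, "C": 3.0, "D": -1.0, "E": 0.5}


def toy_score(proteins):
    return sum(toy_weights[p] for p in proteins)


subnetwork, subnetwork_score = greedy_subnetwork(toy_adjacency, "A", toy_score)
```

Running the search from every protein in the network as a seed and keeping the highest-scoring results is one common way to obtain a set of candidate subnetworks; a greedy search of this kind finds local optima and does not guarantee the globally maximal subnetwork.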
This article proposes a MANOVA-based scoring method to combine PPI data with expression data using a greedy search. This method is recommended for the highly sensitive detection of large subnetworks.