Liu Lu, Ruan Jianhua
Department of Computer Science, The University of Texas at San Antonio, San Antonio, USA.
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2013:218-221. doi: 10.1109/BIBM.2013.6732493.
Finding associations between an input gene set, such as genes associated with a certain phenotype, and annotated gene sets, such as known pathways, is an important problem in modern molecular biology. Existing approaches mainly focus on the overlap between the two sets and may miss important but subtle relationships between genes. In this paper, we propose a method, NetPEA, that combines known pathways with high-throughput networks. Our method considers not only the shared genes but also the interactions between genes. It uses a protein-protein interaction network and a random walk procedure to identify hidden relationships between gene sets, and a randomization strategy to evaluate the significance of the similarity scores that pathways achieve. Compared with the over-representation-based method, our method can identify more relationships. Compared with a state-of-the-art network-based method, EnrichNet, our method provides not only a ranked list of pathways but also their statistical significance. Importantly, through independent tests, we show that our method likely has a higher sensitivity in revealing the true causal pathways while at the same time achieving a higher specificity. A literature review of selected results indicates that some of the novel pathways reported by our method are biologically relevant and important.
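The abstract names two ingredients: a random walk over a protein-protein interaction network to score how close a pathway is to the input gene set, and a randomization step to judge whether that score is significant. The sketch below illustrates that general idea only. It is not the NetPEA implementation: the restart probability, the scoring rule (summing walk probabilities over pathway genes), the null model (random seed sets of the same size), and all function names are assumptions made for illustration.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walk that restarts at the seed genes.

    adj     : symmetric adjacency matrix of the interaction network (assumed input)
    seeds   : indices of the input gene set
    restart : probability of jumping back to a seed at each step (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # isolated nodes keep an all-zero column
    transition = adj / col_sums            # column-normalized transition matrix

    seeds = list(seeds)
    p0 = np.zeros(adj.shape[0])
    p0[seeds] = 1.0 / len(seeds)           # start uniformly on the seed genes

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def pathway_score(probabilities, pathway):
    """Hypothetical similarity score: total walk probability on pathway genes."""
    return float(probabilities[list(pathway)].sum())


def empirical_p_value(adj, seeds, pathway, n_perm=200, restart=0.5, rng=None):
    """Compare the observed score with scores from random seed sets of equal size."""
    rng = np.random.default_rng(rng)
    n_nodes = np.asarray(adj).shape[0]
    seeds = list(seeds)

    observed = pathway_score(random_walk_with_restart(adj, seeds, restart), pathway)

    hits = 0
    for _ in range(n_perm):
        random_seeds = rng.choice(n_nodes, size=len(seeds), replace=False)
        null_score = pathway_score(
            random_walk_with_restart(adj, random_seeds, restart), pathway
        )
        if null_score >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


def path_graph(n):
    """Toy network: genes 0..n-1 connected in a chain (illustration only)."""
    adj = np.zeros((n, n))
    for i in range(n - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return adj
```

On the toy chain from `path_graph(6)`, a walk seeded at gene 0 places more probability on a pathway made of genes 1 and 2 than on one made of genes 4 and 5, even though neither pathway shares a gene with the seed set. An overlap-only comparison would score both pathways as zero, which is the kind of relationship the abstract says such methods miss.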