Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan.
Bioinformatics. 2010 Sep 1;26(17):2128-35. doi: 10.1093/bioinformatics/btq344. Epub 2010 Jun 29.
An observed metabolic response is the result of the coordinated activation of, and interaction between, multiple genetic pathways. However, because of the complex structure of metabolism, it is not yet fully understood which pathways are required to produce an observed metabolic response. In this article, we propose an approach that can identify the genetic pathways which dictate the response of a metabolic network to specific experimental conditions.
Our approach is a combination of probabilistic models for pathway ranking, clustering and classification. First, we use a non-parametric pathway extraction method to identify the most highly correlated paths through the metabolic network. We then extract the defining structure within these top-ranked pathways using both Markov clustering and classification algorithms. Furthermore, we define detailed node and edge annotations, which enable us not only to track each pathway with respect to its genetic dependencies, but also to analyze the interacting reactions, compounds and KEGG sub-networks. We show that our approach identifies biologically meaningful pathways within two microarray expression datasets using entire KEGG metabolic networks.
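To make the first step concrete, the following is a minimal Python sketch of ranking paths through a small network by how strongly correlated their edges are. It is an illustration under stated assumptions, not the method implemented in the R package: the toy network, the edge weights, the use of the mean edge weight as the path score, and the function names `simple_paths` and `top_ranked_paths` are all hypothetical, and exhaustive path enumeration is only practical for very small graphs.

```python
# Hypothetical toy network: directed edges between compounds. Each weight is an
# assumed correlation score for the genes associated with that edge.
EDGES = {
    ("A", "B"): 0.9,
    ("B", "D"): 0.8,
    ("A", "C"): 0.4,
    ("C", "D"): 0.7,
    ("B", "C"): 0.6,
}


def build_adjacency(edges):
    """Turn {(src, dst): weight} into {src: [(dst, weight), ...]}."""
    adjacency = {}
    for (src, dst), weight in edges.items():
        adjacency.setdefault(src, []).append((dst, weight))
    return adjacency


def simple_paths(adjacency, node, target, path=None, total=0.0):
    """Yield (path, summed_weight) for every cycle-free path from node to target."""
    path = [node] if path is None else path
    if node == target:
        yield list(path), total
        return
    for nxt, weight in adjacency.get(node, []):
        if nxt in path:  # skip nodes already visited to avoid cycles
            continue
        path.append(nxt)
        yield from simple_paths(adjacency, nxt, target, path, total + weight)
        path.pop()


def top_ranked_paths(edges, source, target, k=2):
    """Return the k paths with the highest mean edge weight (assumed score)."""
    adjacency = build_adjacency(edges)
    scored = [
        (total / (len(path) - 1), path)
        for path, total in simple_paths(adjacency, source, target)
        if len(path) > 1  # a path needs at least one edge to have a mean
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    for score, path in top_ranked_paths(EDGES, "A", "D", k=2):
        print(" -> ".join(path), round(score, 3))
```

In this sketch the top-ranked paths would then be the input to the clustering and classification steps; those steps are not shown here.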
An R package containing a full implementation of our proposed method is currently available from http://www.bic.kyoto-u.ac.jp/pathway/timhancock.