Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
PLoS One. 2013 Apr 23;8(4):e61134. doi: 10.1371/journal.pone.0061134. Print 2013.
Genetic and pharmacological perturbation experiments, such as deleting a gene and monitoring gene expression responses, are powerful tools for studying cellular signal transduction pathways. However, it remains a challenge to automatically derive knowledge of a cellular signaling system at a conceptual level from systematic perturbation-response data. In this study, we explored a framework that unifies knowledge mining and data mining toward this goal. The framework consists of the following automated processes: 1) applying an ontology-driven knowledge mining approach to identify functional modules among the genes responding to a perturbation, in order to reveal potential signals affected by the perturbation; 2) applying a graph-based data mining approach to search for perturbations that affect a common signal; and 3) revealing the architecture of a signaling system by organizing signaling units into a hierarchy based on their relationships. Applying this framework to a compendium of yeast perturbation-response data, we recovered many well-known signal transduction pathways. In addition, our analysis led to many new hypotheses regarding the yeast signal transduction system. Finally, our analysis automatically organized perturbed genes into a graph reflecting the architecture of the yeast signaling system. Importantly, this framework transformed molecular findings from the gene level to a conceptual level, which can be readily translated into computable knowledge in the form of rules regarding the yeast signaling system, such as "if genes involved in MAPK signaling are perturbed, genes involved in pheromone responses will be differentially expressed."
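The three processes named in the abstract can be illustrated with a deliberately simplified Python sketch. It is not the authors' implementation: a flat gene-to-term lookup stands in for the ontology-driven module search, a plain grouping stands in for the graph-based search, and strict set inclusion between perturbation sets is assumed as the relationship used to build the hierarchy. All identifiers (`p1`, `g1`, `signal_A`, the function names, and the `min_size` threshold) are hypothetical placeholders, and the toy data are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: perturbation -> genes that respond to it.
RESPONSES = {
    "p1": {"g1", "g2", "g3", "g4"},
    "p2": {"g1", "g2", "g3", "g4"},
    "p3": {"g1", "g2", "g5"},
}

# Hypothetical flat annotation standing in for ontology terms.
ANNOTATION = {
    "g1": "signal_A", "g2": "signal_A",
    "g3": "signal_B", "g4": "signal_B",
    "g5": "signal_C",
}


def functional_modules(genes, annotation, min_size=2):
    """Step 1 (simplified): group responding genes by shared annotation term."""
    by_term = defaultdict(set)
    for gene in genes:
        term = annotation.get(gene)
        if term is not None:
            by_term[term].add(gene)
    # Keep only terms supported by at least min_size responding genes.
    return {t: gs for t, gs in by_term.items() if len(gs) >= min_size}


def signaling_units(responses, annotation, min_size=2):
    """Step 2 (simplified): collect perturbations that affect a common signal."""
    units = defaultdict(set)
    for perturbation, genes in responses.items():
        for term in functional_modules(genes, annotation, min_size):
            units[term].add(perturbation)
    return dict(units)


def hierarchy(units):
    """Step 3 (assumed criterion): link a unit to any unit whose
    perturbation set is a strict subset of its own."""
    edges = []
    for parent, parent_perts in units.items():
        for child, child_perts in units.items():
            if parent != child and child_perts < parent_perts:
                edges.append((parent, child))
    return sorted(edges)


def rules(units):
    """Render each unit as a conceptual-level rule string."""
    return [
        f"if any of {sorted(perts)} is perturbed, "
        f"genes annotated with {term} are differentially expressed"
        for term, perts in sorted(units.items())
    ]


if __name__ == "__main__":
    units = signaling_units(RESPONSES, ANNOTATION)
    print(hierarchy(units))
    for rule in rules(units):
        print(rule)
```

In this toy setting, `signal_C` is dropped because only one responding gene supports it, and the hierarchy edge reflects only that one unit's perturbation set is nested inside another's. A real analysis would replace the lookup table with ontology-based semantic grouping and would need a statistically motivated criterion for module membership.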