Vert Jean Philippe, Kanehisa Minoru
Centre de Géostatistique, Ecole des Mines de Paris, Fontainebleau cedex, France.
Bioinformatics. 2003 Oct;19 Suppl 2:ii238-44. doi: 10.1093/bioinformatics/btg1084.
A promising way to make sense of gene expression profiles is to relate them to the activity of metabolic and signalling pathways. Each pathway usually involves many genes, such as those encoding enzymes, which can themselves participate in many pathways. The set of all known pathways can therefore be represented as a complex network of genes. Searching for regularities in the set of gene expression profiles with respect to the topology of this gene network is a way to automatically extract active pathways and their associated patterns of activity.
We present a method to perform this task. It consists of encoding the gene network and the set of profiles as two kernel functions, one for each, and performing a regularized form of canonical correlation analysis between the two kernels.
When applied to publicly available expression data, the method is able to extract biologically relevant expression patterns, as well as pathways with related activity.
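The two-kernel approach described above can be illustrated with a minimal sketch. The abstract does not say which kernels or which regularization scheme are used, so everything below is an assumption made for illustration: a graph diffusion kernel for the gene network, a linear kernel for the expression profiles, a ridge-style regularizer `reg`, and a toy six-gene chain with random profiles. The function names (`diffusion_kernel`, `linear_kernel`, `regularized_kernel_cca`) are hypothetical and not taken from the paper.

```python
import numpy as np
from scipy.linalg import expm, eigh


def diffusion_kernel(adjacency, tau=1.0):
    """Assumed network kernel: exp(-tau * L), with L = D - A the graph Laplacian."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return expm(-tau * laplacian)


def linear_kernel(profiles):
    """Assumed expression kernel: inner products of mean-centered profiles."""
    centered = profiles - profiles.mean(axis=0)
    return centered @ centered.T


def center_kernel(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def regularized_kernel_cca(K1, K2, reg=0.1):
    """First canonical pair of a ridge-regularized kernel CCA.

    Maximizes a' K1 K2 b subject to a' (K1 + reg I)^2 a = 1 and
    b' (K2 + reg I)^2 b = 1, solved as a generalized eigenproblem.
    """
    n = K1.shape[0]
    K1, K2 = center_kernel(K1), center_kernel(K2)
    R1 = K1 + reg * np.eye(n)
    R2 = K2 + reg * np.eye(n)
    Z = np.zeros((n, n))
    A = np.block([[Z, K1 @ K2], [K2 @ K1, Z]])
    B = np.block([[R1 @ R1, Z], [Z, R2 @ R2]])
    vals, vecs = eigh(A, B)  # eigenvalues are returned in ascending order
    return vals[-1], vecs[:n, -1], vecs[n:, -1]


# Toy data (hypothetical): 6 genes linked in a chain, 4 expression conditions.
adjacency = np.diag(np.ones(5), 1) + np.diag(np.ones(5), -1)
profiles = np.random.default_rng(0).normal(size=(6, 4))

rho, alpha, beta = regularized_kernel_cca(
    diffusion_kernel(adjacency), linear_kernel(profiles), reg=0.1
)
# alpha weights genes on the network side, beta on the expression side.
```

In this sketch, genes with large entries in `alpha` and `beta` are those whose network neighbourhood and expression profile vary together, which is the sense in which correlated directions might point to active pathways. Larger values of `reg` shrink the reported correlation `rho` and favour smoother solutions; choosing it on real data would need validation that this example does not attempt.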