Mordelet Fantine, Vert Jean-Philippe
Ecole des Mines de Paris, ParisTech, Fontainebleau, France.
Bioinformatics. 2008 Aug 15;24(16):i76-82. doi: 10.1093/bioinformatics/btn273.
Living cells are the product of gene expression programs that involve the regulated transcription of thousands of genes. The elucidation of transcriptional regulatory networks is thus needed to understand the cell's working mechanism and can, for example, be useful for the discovery of novel therapeutic targets. Although several methods have been proposed to infer gene regulatory networks from gene expression data, a recent comparison on a large-scale benchmark experiment revealed that most current methods predict only a limited number of known regulations at a reasonable precision level.
We propose SIRENE (Supervised Inference of Regulatory Networks), a new method for the inference of gene regulatory networks from a compendium of expression data. The method decomposes the problem of gene regulatory network inference into a large number of local binary classification problems that focus on separating target genes from non-targets for each transcription factor. SIRENE is thus conceptually simple and computationally efficient. We test it on a benchmark experiment aimed at predicting regulations in Escherichia coli and show that it retrieves on the order of six times more known regulations than other state-of-the-art inference methods.
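The decomposition into one local binary problem per transcription factor can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`score_targets_for_tf`, `infer_network`) are hypothetical, the classifier is a simple linear nearest-centroid rule standing in for whatever classifier SIRENE actually uses, and every gene not known to be a target is treated as a negative and scored without held-out folds.

```python
from statistics import fmean


def centroid(rows):
    """Column-wise mean of a list of equal-length expression profiles."""
    return [fmean(col) for col in zip(*rows)]


def score_targets_for_tf(expression, known_targets):
    """Solve one local binary problem: targets vs. non-targets of a single TF.

    expression: dict mapping gene -> list of expression values (one per experiment)
    known_targets: set of genes already known to be regulated by this TF
    Returns a dict mapping gene -> score; a higher score means more target-like.
    """
    pos = [x for g, x in expression.items() if g in known_targets]
    neg = [x for g, x in expression.items() if g not in known_targets]
    c_pos, c_neg = centroid(pos), centroid(neg)
    # Nearest-centroid rule written as a linear classifier (assumed stand-in):
    # w points from the negative centroid to the positive one, and b places
    # the decision boundary midway between the two centroids.
    w = [p - n for p, n in zip(c_pos, c_neg)]
    b = -0.5 * sum(wi * (p + n) for wi, p, n in zip(w, c_pos, c_neg))
    return {g: sum(wi * xi for wi, xi in zip(w, x)) + b
            for g, x in expression.items()}


def infer_network(expression, known_regulations):
    """Run one local classifier per TF and rank the candidate new regulations.

    known_regulations: dict mapping TF -> set of its known target genes
    Returns a list of (tf, gene, score) tuples sorted by decreasing score.
    """
    predictions = []
    for tf, targets in known_regulations.items():
        scores = score_targets_for_tf(expression, targets)
        for gene, score in scores.items():
            # Report only pairs that are not already known, and skip self-pairs.
            if gene not in targets and gene != tf:
                predictions.append((tf, gene, score))
    return sorted(predictions, key=lambda t: t[2], reverse=True)
```

Because each transcription factor is handled independently, the per-factor problems are small and could be run in parallel, which is consistent with the claim of computational efficiency. A more careful version would score each gene with a classifier trained without that gene.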
All data and programs are freely available at http://cbio.ensmp.fr/sirene.