Mrowka Ralf, Blüthgen Nils, Fähling Michael
Paul-Ehrlich-Zentrum für Experimentelle Medizin, AG Systems Biology-Computational Physiology, Tucholskystrasse 2, Berlin, Germany.
FEBS J. 2008 Jun;275(12):3178-92. doi: 10.1111/j.1742-4658.2008.06471.x. Epub 2008 May 13.
Reliable prediction of specific transcription factor target genes is a major challenge in systems biology and functional genomics. Current sequence-based methods yield many false predictions because DNA-binding motifs are short and degenerate. Here, we describe a new systematic genome-wide approach, the seed-distribution-distance method, that searches large-scale genome-wide expression data for genes that are expressed similarly to known targets. This method is used to identify genes that are likely targets, allowing sequence-based methods to focus on a subset of genes and thus produce fewer false-positive predictions. We show by cross-validation that this method is robust in recovering specific target genes. Furthermore, this method identifies genes that share the typical functions and binding motifs of the seed genes. The method is illustrated by predicting novel targets of the transcription factor nuclear factor kappaB (NF-kappaB). Among the new targets is optineurin, which plays a key role in the pathogenesis of acquired blindness caused by adult-onset primary open-angle glaucoma. We show experimentally that the optineurin gene and other predicted genes are targets of NF-kappaB. Thus, our data provide a missing link between NF-kappaB signalling and the damping function of optineurin in NF-kappaB feedback signalling. We present a robust and reliable method to enhance the genome-wide prediction of specific transcription factor target genes that exploits the vast amount of expression information available in public databases today.
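The idea of searching expression data for genes expressed similarly to known targets can be illustrated with a minimal sketch. The abstract does not give the details of the seed-distribution-distance method, so the code below is not that method. It is a simplified stand-in that assumes each gene is scored by its mean Pearson correlation with a set of seed genes. The function names (`pearson`, `rank_candidates`), the gene labels and the expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile carries no co-expression signal; treat it as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_candidates(expression, seed_genes):
    """Rank non-seed genes by mean correlation with the seed genes.

    expression: dict mapping gene name -> list of expression values,
                one value per experimental condition (same order for all genes).
    seed_genes: names of known target genes (the "seed").
    Assumption: mean correlation is used here as a simple similarity score;
    the published method may define similarity differently.
    """
    seeds = [g for g in seed_genes if g in expression]
    if not seeds:
        raise ValueError("no seed gene is present in the expression data")
    scores = {}
    for gene, profile in expression.items():
        if gene in seeds:
            continue  # seeds are known targets, not candidates
        total = sum(pearson(profile, expression[s]) for s in seeds)
        scores[gene] = total / len(seeds)
    # Highest mean correlation first: most seed-like candidates at the top.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four conditions, two seed genes, three candidates.
expression = {
    "seed_a": [1.0, 2.0, 3.0, 4.0],
    "seed_b": [2.0, 4.0, 6.0, 8.5],
    "gene_x": [0.5, 1.0, 1.5, 2.1],  # rises with the seeds
    "gene_y": [4.0, 3.0, 2.0, 1.0],  # falls as the seeds rise
    "gene_z": [1.0, 1.0, 1.0, 1.0],  # flat profile
}
ranking = rank_candidates(expression, ["seed_a", "seed_b"])
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In a workflow of the kind the abstract describes, only the top-ranked candidates would then be passed to a sequence-based motif search, which is how the expression-based step is meant to reduce false-positive predictions.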