Kantorovitz Miriam R, Kazemian Majid, Kinston Sarah, Miranda-Saavedra Diego, Zhu Qiyun, Robinson Gene E, Göttgens Berthold, Halfon Marc S, Sinha Saurabh
Department of Mathematics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
Dev Cell. 2009 Oct;17(4):568-79. doi: 10.1016/j.devcel.2009.09.002.
We present new approaches to cis-regulatory module (CRM) discovery in the common scenario where relevant transcription factors and/or motifs are unknown. Beginning with a small list of CRMs mediating a common gene expression pattern, we search genome-wide for CRMs with similar functionality, using new statistical scores and without requiring known motifs or accurate motif discovery. We cross-validate our predictions on 31 regulatory networks in Drosophila and through correlations with gene expression data. Five predicted modules tested using an in vivo reporter gene assay all show tissue-specific regulatory activity. We also demonstrate our methods' ability to predict mammalian tissue-specific enhancers. Finally, we predict human CRMs that regulate early blood and cardiovascular development. In vivo transgenic mouse analysis of two predicted CRMs demonstrates that both have appropriate enhancer activity. Overall, 7/7 predictions were validated successfully in vivo, demonstrating the effectiveness of our approach for insect and mammalian genomes.