Thompson Jeffrey A, Congdon Clare Bates
Department of Computer Science, University of Southern Maine, Portland, Maine 04104.
Proc Congr Evol Comput. 2014 Jul;2014. doi: 10.1109/cec.2014.6900542. Epub 2014 Sep 22.
In this work, we extend GAMI (Genetic Algorithms for Motif Inference), a motif inference system, to find sets of motifs that may function together as part of a cis-regulatory module (CRM), using a comparative genomics approach. Evidence suggests that most transcription factor binding sites are part of a CRM, so our new approach is expected to yield stronger candidates when inferring regulatory elements and their combinatorial regulation of genes. Because our approach is based on genetic algorithms, we are able to search relatively large input sequences (100,000 nt or longer). Most current computational approaches to identifying candidate CRMs depend on prior knowledge of the processes in which the regulated genes are involved. In comparison with one leading method, Cluster-Buster, our prototype approach, which we call GAMI-CRM, performed well, suggesting that GAMI-CRM will be particularly useful in predicting CRMs for genes whose interactions are poorly understood.
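The abstract does not describe GAMI's representation, operators, or scoring, so the following is only a generic toy sketch of how a genetic algorithm can search a set of sequences for a conserved motif. Everything in it is an assumption made for illustration: the names `best_match`, `fitness`, and `evolve_motif`, the fixed motif length `k`, the match-count score, and the selection, crossover, and mutation settings. It evolves a single motif and does not model CRMs, which involve sets of motifs.

```python
import random

ALPHABET = "ACGT"


def best_match(motif, seq):
    """Highest number of matching positions between motif and any window of seq.

    Assumes len(seq) >= len(motif).
    """
    k = len(motif)
    return max(
        sum(a == b for a, b in zip(motif, seq[i:i + k]))
        for i in range(len(seq) - k + 1)
    )


def fitness(motif, seqs):
    """Toy conservation score: sum of best window matches over all sequences."""
    return sum(best_match(motif, s) for s in seqs)


def evolve_motif(seqs, k=8, pop_size=60, generations=80,
                 mutation_rate=0.1, seed=0):
    """Evolve a length-k motif (k >= 2) that is well conserved across seqs.

    Hypothetical parameters; not the settings used by GAMI or GAMI-CRM.
    """
    rng = random.Random(seed)
    score = lambda m: fitness(m, seqs)
    # Start from random k-mers.
    pop = ["".join(rng.choice(ALPHABET) for _ in range(k))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the two best individuals over unchanged.
        next_pop = sorted(pop, key=score, reverse=True)[:2]
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # Single-point crossover followed by per-position mutation.
            cut = rng.randrange(1, k)
            child = p1[:cut] + p2[cut:]
            child = "".join(
                rng.choice(ALPHABET) if rng.random() < mutation_rate else c
                for c in child
            )
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=score)
```

Calling `evolve_motif(seqs, k=6)` on a small list of DNA strings returns the best-scoring candidate motif found. Because the score only counts matching positions, a real system would need a richer objective, for example one that rewards conservation across species in the comparative genomics sense the abstract mentions.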