Kang Keunsoo, Chung Jae Hoon, Kim Joomyeong
Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA.
Nucleic Acids Res. 2009 Apr;37(6):2003-13. doi: 10.1093/nar/gkp077. Epub 2009 Feb 10.
We have developed a new bioinformatics approach called ECMFinder (Evolutionary Conserved Motif Finder). This program searches for a given DNA motif within the entire genome of one species and uses the gene association information of a potential transcription factor-binding site (TFBS) to screen the homologous regions of a second and third species. If multiple species have this potential TFBS in homologous positions, this program recognizes the identified TFBS as an evolutionary conserved motif (ECM). This program outputs a list of ECMs, which can be uploaded as a Custom Track in the UCSC genome browser and can be visualized along with other available data. The feasibility of this approach was tested by searching the genomes of three mammals (human, mouse and cow) with the DNA-binding motifs of YY1 and CTCF. This program successfully identified many clustered YY1- and CTCF-binding sites that are conserved among these species but were previously undetected. In particular, this program identified CTCF-binding sites that are located close to the Dlk1, Magel2 and Cdkn1c imprinted genes. Individual ChIP experiments confirmed the in vivo binding of the YY1 and CTCF proteins to most of these newly discovered binding sites, demonstrating the feasibility and usefulness of ECMFinder.
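The workflow described above (scan one genome for a motif, associate each hit with a gene, keep hits whose counterpart genes in the other species also carry the motif, then export a Custom Track) can be illustrated with a minimal Python sketch. This is not the ECMFinder implementation: the function names, the `max_dist` window, the use of shared gene names as a stand-in for homology, and the `GCCATNTT` pattern are all assumptions chosen for illustration, and the sequences in the usage note are toy data.

```python
import re
from dataclasses import dataclass

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
         "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]", "B": "[CGT]",
         "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def reverse_complement(motif):
    return motif.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, motif):
    """Return sorted (start, end, strand) hits on both strands (0-based, half-open)."""
    seq = seq.upper()
    hits = set()
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        # A lookahead lets overlapping occurrences be reported as well.
        regex = re.compile("(?=(" + "".join(IUPAC[b] for b in pattern) + "))")
        for m in regex.finditer(seq):
            hits.add((m.start(), m.start() + len(motif), strand))
    return sorted(hits)


def nearest_gene(pos, chrom, genes, max_dist):
    """Closest gene on the same chromosome within max_dist bases, or None."""
    best = None
    for g in genes:
        if g.chrom != chrom:
            continue
        dist = 0 if g.start <= pos < g.end else min(abs(pos - g.start), abs(pos - g.end))
        if dist <= max_dist and (best is None or dist < best[0]):
            best = (dist, g)
    return best[1] if best else None


def genes_with_sites(genome, genes, motif, max_dist):
    """Map upper-cased gene name -> list of motif hits associated with that gene."""
    found = {}
    for chrom, seq in genome.items():
        for start, end, strand in find_sites(seq, motif):
            g = nearest_gene(start, chrom, genes, max_dist)
            if g is not None:
                found.setdefault(g.name.upper(), []).append((chrom, start, end, strand))
    return found


def conserved_motifs(reference, others, motif, max_dist=10_000):
    """Return BED lines for reference hits whose gene also has a hit in every other species.

    Assumption: genes sharing a name across species are treated as homologous.
    """
    ref_hits = genes_with_sites(*reference, motif, max_dist)
    other_hits = [genes_with_sites(*species, motif, max_dist) for species in others]
    bed = []
    for gene, sites in sorted(ref_hits.items()):
        if all(gene in hits for hits in other_hits):
            for chrom, start, end, strand in sites:
                bed.append(f"{chrom}\t{start}\t{end}\tECM_{gene}\t0\t{strand}")
    return bed
```

As a usage example with toy data, `conserved_motifs(({"chr1": "AAAAGCCATATTCCCC"}, [Gene("GeneA", "chr1", 14, 16)]), [({"chr2": "TTAAAATGGCGG"}, [Gene("GeneA", "chr2", 0, 12)])], "GCCATNTT", max_dist=100)` reports the reference hit because the second species carries the motif on the reverse strand near the same-named gene. The tab-separated lines follow the six-column BED layout (chrom, start, end, name, score, strand) that the UCSC genome browser accepts for Custom Tracks.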