Ovcharenko Ivan, Loots Gabriela G, Giardine Belinda M, Hou Minmei, Ma Jian, Hardison Ross C, Stubbs Lisa, Miller Webb
Energy, Environment, Biology and Institutional Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, USA.
Genome Res. 2005 Jan;15(1):184-94. doi: 10.1101/gr.3007205. Epub 2004 Dec 8.
Multiple-sequence alignment analysis is a powerful approach for understanding phylogenetic relationships, annotating genes, and detecting functional regulatory elements. With a growing number of partly or fully sequenced vertebrate genomes, effective tools for performing multiple comparisons are required to accurately and efficiently assist biological discoveries. Here we introduce Mulan (http://mulan.dcode.org/), a novel method and a network server for comparing multiple draft and finished-quality sequences to identify functional elements conserved over evolutionary time. Mulan brings together several novel algorithms: the TBA multi-aligner program for rapid identification of local sequence conservation, and the multiTF program for detecting evolutionarily conserved transcription factor binding sites in multiple alignments. In addition, Mulan supports two-way communication with the GALA database; alignments of multiple species dynamically generated in GALA can be viewed in Mulan, and conserved transcription factor binding sites identified with Mulan/multiTF can be integrated and overlaid with extensive genome annotation data using GALA. Local multiple alignments computed by Mulan ensure reliable representation of short- and large-scale genomic rearrangements in distant organisms. Mulan allows for interactive modification of critical conservation parameters to differentially predict conserved regions in comparisons of both closely and distantly related species. We illustrate the uses and applications of the Mulan tool through multispecies comparisons of the GATA3 gene locus and the identification of elements that are conserved in a different way in avians than in other genomes, allowing speculation on the evolution of birds. Source code for the aligners and the aligner-evaluation software can be freely downloaded from http://www.bx.psu.edu/miller_lab/.