Chapman Michael A, Donaldson Ian J, Gilbert James, Grafham Darren, Rogers Jane, Green Anthony R, Göttgens Berthold
Cambridge Institute for Medical Research, Cambridge, CB2 2XY, UK.
Genome Res. 2004 Feb;14(2):313-8. doi: 10.1101/gr.1759004. Epub 2004 Jan 12.
Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments.