Novichkov Pavel S, Ratnere Igor, Wolf Yuri I, Koonin Eugene V, Dubchak Inna
Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA.
Nucleic Acids Res. 2009 Jan;37(Database issue):D448-54. doi: 10.1093/nar/gkn684. Epub 2008 Oct 9.
The database of Alignable Tight Genomic Clusters (ATGCs) consists of closely related genomes of archaea and bacteria and is a resource for research into prokaryotic microevolution. Constructing a data set with appropriate characteristics is a major hurdle for this type of study. At the current rate of genome sequencing, it is difficult to follow the progress of the field and to determine which of the available genome sets meet the requirements of a given research project, in particular with respect to the minimum and maximum levels of similarity between the included genomes. Additionally, extracting specific content, such as genomic alignments or families of orthologs, from a selected set of genomes is a complicated and time-consuming process. The database addresses these problems by providing an intuitive and efficient web interface to browse precomputed ATGCs, select appropriate ones, and access ATGC-derived data, including multiple alignments of orthologous proteins and matrices of pairwise intergenomic distances based on genome-wide analysis of synonymous and nonsynonymous substitution rates. The ATGC database will be regularly updated following new releases of NCBI RefSeq. The database is hosted by the Genomics Division at Lawrence Berkeley National Laboratory and is publicly available at http://atgc.lbl.gov.