Cer Regina Z, Bruce Kevin H, Mudunuri Uma S, Yi Ming, Volfovsky Natalia, Luke Brian T, Bacolla Albino, Collins Jack R, Stephens Robert M
Advanced Biomedical Computing Center, Information Systems Program, SAIC-Frederick, Inc, NCI-Frederick, Frederick, MD 21702, USA.
Nucleic Acids Res. 2011 Jan;39(Database issue):D383-91. doi: 10.1093/nar/gkq1170. Epub 2010 Nov 21.
Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become widely accepted. To provide access to the genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, and inverted repeats, mirror repeats and direct repeats together with their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. To make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.
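As a rough illustration of what a "quadruplex-forming motif" prediction can look like, the sketch below scans a sequence for four runs of three or more guanines separated by loops of 1 to 7 bases. This pattern is a widely used heuristic and is an assumption here; the abstract does not state the search criteria used by non-B DB. The function name `find_g4_motifs` and the example sequence are hypothetical, and only the given strand is scanned.

```python
import re

# Assumed heuristic (not taken from the abstract): four G-runs of length >= 3,
# separated by three loops of 1-7 bases each.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_g4_motifs(sequence):
    """Return (start, end, motif) tuples for non-overlapping matches.

    Coordinates are 0-based with an exclusive end, and refer to the
    strand that was passed in; the reverse complement is not searched.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence, not real genomic data.
    for start, end, motif in find_g4_motifs("TTGGGAGGGTGGGAGGGTT"):
        print(start, end, motif)
```

A genome-wide scan would apply the same idea chromosome by chromosome and also search the complementary strand (for example, by matching runs of C), which this sketch leaves out for brevity.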