Saccone Scott F, Quan Jiaxi, Mehta Gaurang, Bolze Raphael, Thomas Prasanth, Deelman Ewa, Tischfield Jay A, Rice John P
Department of Psychiatry, Washington University, University of Southern California, Washington University, USA.
Nucleic Acids Res. 2011 Jan;39(Database issue):D901-7. doi: 10.1093/nar/gkq1054. Epub 2010 Oct 30.
Genome-wide association studies often draw on public biological databases to provide a biological reference for interpreting results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many organisms, including humans. We have developed free software that downloads and installs a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables according to the common tasks we wish to accomplish with the database. For each task, we have designed a small set of custom tables that facilitate task-related queries, and we provide entity-relationship diagrams, composed from the relevant dbSNP tables, for each task. To expose these concepts and methods to a wider audience, we have developed web tools for querying the database and for browsing documentation on the tables and columns, which clarifies the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale.
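As a rough illustration of the kind of task-related query the abstract describes, the sketch below builds a small custom table that links SNPs to genes and then looks up every SNP in a given gene. It is only a minimal sketch under stated assumptions: the table names (`snp`, `snp_gene`), the column names, the helper functions, and all rows are hypothetical and are not taken from the dbSNP schema or from the authors' software. An in-memory SQLite database stands in for the local MySQL installation so that the example runs without a server.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical, simplified tables.

    Assumption: these tables only mimic the idea of a small task-specific
    table set; they are not real dbSNP tables, and every row is made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE snp (
            snp_id     INTEGER PRIMARY KEY,
            chromosome TEXT NOT NULL,
            position   INTEGER NOT NULL
        );
        CREATE TABLE snp_gene (
            snp_id         INTEGER NOT NULL REFERENCES snp(snp_id),
            gene_symbol    TEXT NOT NULL,
            function_class TEXT NOT NULL
        );
        """
    )
    # Placeholder rows for illustration only.
    conn.executemany(
        "INSERT INTO snp VALUES (?, ?, ?)",
        [(1, "1", 100), (2, "1", 250), (3, "2", 75)],
    )
    conn.executemany(
        "INSERT INTO snp_gene VALUES (?, ?, ?)",
        [(2, "GENE_A", "intron"), (1, "GENE_A", "missense"), (3, "GENE_B", "intron")],
    )
    conn.commit()
    return conn


def snps_in_gene(conn, gene_symbol):
    """Return (snp_id, chromosome, position, function_class) rows for one gene."""
    query = """
        SELECT s.snp_id, s.chromosome, s.position, g.function_class
        FROM snp AS s
        JOIN snp_gene AS g ON g.snp_id = s.snp_id
        WHERE g.gene_symbol = ?
        ORDER BY s.position
    """
    return conn.execute(query, (gene_symbol,)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in snps_in_gene(demo, "GENE_A"):
        print(row)
```

Against a real local MySQL copy of dbSNP, the same pattern would presumably use a MySQL client library and the actual table and column names documented by the web tools; the parameterized join is the part this sketch is meant to show.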