Department of Zoology, University of Oxford, UK.
BMC Bioinformatics. 2010 Dec 10;11:595. doi: 10.1186/1471-2105-11-595.
The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner.
The Bacterial Isolate Genome Sequence Database (BIGSDB) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. The laboratory information management system (LIMS) functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases, and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSDB are illustrated with the genera Neisseria and Streptococcus. The BIGSDB source code and documentation are available at http://pubmlst.org/software/database/bigsdb/.
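The relationship between loci, allelic variants and schemes described above can be sketched in a few lines of Python. This is only a toy illustration and not the BIGSDB implementation: the class names `Locus` and `Scheme`, the method names, the allele sequences and the type labels are all hypothetical, and exact substring matching is used as a simple stand-in for whatever sequence comparison the software actually performs.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Locus:
    """A named locus with its known allele sequences, keyed by allele id (hypothetical model)."""
    name: str
    alleles: dict[int, str] = field(default_factory=dict)

    def identify(self, sequence: str) -> int | None:
        """Return the id of the first known allele found in `sequence`, or None.

        Assumption: exact substring matching stands in for real sequence comparison.
        """
        for allele_id, allele_seq in self.alleles.items():
            if allele_seq in sequence:
                return allele_id
        return None


@dataclass
class Scheme:
    """An ordered group of loci plus a table mapping allele profiles to type labels."""
    name: str
    loci: list[Locus]
    profiles: dict[tuple[int, ...], str] = field(default_factory=dict)

    def characterise(self, sequence: str) -> tuple[tuple[int | None, ...], str | None]:
        """Build the allele profile of a sequence and look up its type label, if any."""
        profile = tuple(locus.identify(sequence) for locus in self.loci)
        return profile, self.profiles.get(profile)


# Toy data: the allele sequences and type labels are invented for illustration.
abcZ = Locus("abcZ", {1: "ACGTAC", 2: "ACGTTC"})
adk = Locus("adk", {1: "GGCATT", 2: "GGCTTT"})
scheme = Scheme("toy_MLST", [abcZ, adk], {(1, 2): "ST-A", (2, 1): "ST-B"})

genome = "TTTACGTACAAAGGCTTTCCC"
print(scheme.characterise(genome))
```

Because a scheme here is only an ordered list of loci with a lookup table, the same stored sequence could be characterised under any number of alternative schemes without being stored again, which is the property the text emphasises.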
Genomic data can be used to characterise bacterial isolates in many different ways, but they can also be efficiently exploited for evolutionary or functional studies. BIGSDB represents a freely available resource that will assist the broader community in the elucidation of the structure and function of bacteria by means of a population genomics approach.