Diener Stephen E, Houfek Thomas D, Kalat Sam E, Windham D E, Burke Mark, Opperman Charles, Dean Ralph A
Fungal Genomics Laboratory, Center for Integrated Fungal Research, North Carolina State University, Raleigh, NC 27695, USA.
BMC Bioinformatics. 2005 Jun 15;6:147. doi: 10.1186/1471-2105-6-147.
Sequencing of EST and BAC end datasets is no longer limited to large research groups. Drops in per-base pricing have made high-throughput sequencing accessible to individual investigators. However, few free, user-friendly options exist for the BLAST result storage and data-mining needs of biologists.
Here we describe NuclearBLAST, a batch BLAST analysis, storage and management system designed for the biologist. It is a wrapper for NCBI BLAST that provides a user-friendly web interface, including a request wizard and the ability to view and mine the results. All BLAST results are stored in a MySQL database, which allows more advanced data mining through supplied command-line utilities or direct database access. NuclearBLAST can be installed on a single machine or distributed across a cluster of machines to improve analysis throughput. It provides a platform that eases data mining of multiple BLAST results. With the supplied scripts, the program can export data into a spreadsheet-friendly format, automatically assign Gene Ontology terms to sequences and provide bi-directional best hits between two datasets. Users with SQL experience can query the database directly to ask more complex questions and extract any subset of data they require.
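The bi-directional best hit idea mentioned above can be illustrated with a minimal Python sketch. This is not NuclearBLAST's own script or database schema; the function names, the `(query, subject, bit_score)` tuple layout and the sample identifiers are all assumptions made for illustration. A pair is reported only when each sequence is the other's top-scoring hit.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit record: (query_id, subject_id, bit_score).
# This layout is an assumption, not NuclearBLAST's actual storage format.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first seen wins on ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(
    a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]
) -> List[Tuple[str, str]]:
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Made-up example data: dataset A searched against B, and B against A.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
          ("a2", "b2", 90.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 198.0), ("b1", "a3", 110.0),
          ("b2", "a1", 160.0), ("b2", "a2", 95.0)]

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
```

In this made-up example only `a1` and `b1` are each other's best hit, so `pairs` holds that single pair. Ranking by bit score is one possible choice; ranking by E-value would be an equally plausible criterion.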
This tool provides a user-friendly interface for requesting, viewing and mining BLAST results, making the management and data mining of large sets of BLAST analyses tractable for biologists.