BLASTGrabber: a bioinformatic tool for visualization, analysis and sequence selection of massive BLAST data.

Author information

Neumann Ralf Stefan, Kumar Surendra, Haverkamp Thomas Hendricus Augustus, Shalchian-Tabrizi Kamran

Affiliations

Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), University of Oslo, Oslo, Norway.

Publication information

BMC Bioinformatics. 2014 May 5;15:128. doi: 10.1186/1471-2105-15-128.

DOI:10.1186/1471-2105-15-128

PMID:24885091

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4062517/

Abstract

BACKGROUND

Advances in sequencing efficiency have vastly increased the sizes of biological sequence databases, which now include many thousands of genome-sequenced species. The BLAST algorithm remains the main search engine for retrieving sequence information, and must consequently handle data on an unprecedented scale. High-performance computers and parallel processing have made this possible. However, the raw BLAST output from contemporary searches involving thousands of queries is ill-suited for direct human processing. Few programs attempt to directly visualize and interpret BLAST output; those that do often provide only a basic structuring of BLAST data.

RESULTS

Here we present BLASTGrabber, a bioinformatics application suitable for high-throughput sequencing analysis. Implemented as a Java application, BLASTGrabber is OS-independent and includes a user-friendly graphical user interface. Text or XML-formatted BLAST output files can be directly imported, displayed and categorized based on BLAST statistics. Query names and FASTA headers can be analysed by text mining. In addition to visualizing sequence alignments, BLAST data can be ordered as an interactive taxonomy tree. All modes of analysis support selection, export and storage of data. A Java interface-based plugin structure facilitates the addition of customized third-party functionality.
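
As a rough illustration of what "categorized based on BLAST statistics" and "selection" can mean in practice, the following minimal Python sketch bins hits by e-value and selects the queries that have at least one strong hit. It is not BLASTGrabber's code or file format: it assumes BLAST+ tab-separated output (`-outfmt 6` with the default twelve columns), and the names `Hit`, `parse_tabular`, `bin_by_evalue`, `select_queries` and the thresholds are hypothetical choices made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Subset of the default BLAST+ -outfmt 6 columns (assumed input format)."""
    query: str
    subject: str
    identity: float
    evalue: float
    bitscore: float


def parse_tabular(lines):
    """Parse tab-separated BLAST lines, skipping blanks and '#' comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        # Default column order: qseqid sseqid pident length mismatch gapopen
        # qstart qend sstart send evalue bitscore
        hits.append(Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11])))
    return hits


def bin_by_evalue(hits, thresholds=(1e-50, 1e-10, 1e-3)):
    """Group hits under the first (strictest) threshold their e-value satisfies."""
    bins = defaultdict(list)
    for hit in hits:
        label = next(
            (f"<= {t:g}" for t in thresholds if hit.evalue <= t),
            f"> {thresholds[-1]:g}",
        )
        bins[label].append(hit)
    return dict(bins)


def select_queries(hits, max_evalue):
    """Return the sorted query IDs that have at least one hit at or below max_evalue."""
    return sorted({h.query for h in hits if h.evalue <= max_evalue})


# Made-up example records, for illustration only.
SAMPLE = [
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-80\t350",
    "q1\ts2\t75.0\t180\t45\t2\t5\t184\t10\t189\t2e-12\t90.5",
    "q2\ts3\t40.0\t60\t36\t1\t1\t60\t3\t62\t0.5\t30.1",
]

if __name__ == "__main__":
    hits = parse_tabular(SAMPLE)
    for label, members in bin_by_evalue(hits).items():
        print(label, [h.subject for h in members])
    print(select_queries(hits, 1e-10))
```

The selected query IDs could then be used to pull the matching sequences from a FASTA file, which corresponds loosely to the export step the abstract mentions.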

CONCLUSION

The BLASTGrabber application introduces new ways of visualizing and analysing massive BLAST output data by integrating taxonomy identification, text mining capabilities and generic multi-dimensional rendering of BLAST hits. The program is aimed at users without expert computer skills, and its combination of new functionalities makes it flexible and useful for a broad range of operations.
