BLASTGrabber: a bioinformatic tool for visualization, analysis and sequence selection of massive BLAST data.

Author information

Neumann Ralf Stefan, Kumar Surendra, Haverkamp Thomas Hendricus Augustus, Shalchian-Tabrizi Kamran

Affiliations

Section for Genetics and Evolutionary Biology (EVOGENE) and Centre for Epigenetics, Development and Evolution (CEDE), University of Oslo, Oslo, Norway.

Publication information

BMC Bioinformatics. 2014 May 5;15:128. doi: 10.1186/1471-2105-15-128.

Abstract

BACKGROUND

Advances in sequencing efficiency have vastly increased the sizes of biological sequence databases, including many thousands of genome-sequenced species. The BLAST algorithm remains the main search engine for retrieving sequence information, and must consequently handle data on an unprecedented scale. This has been possible due to high-performance computers and parallel processing. However, the raw BLAST output from contemporary searches involving thousands of queries becomes ill-suited for direct human processing. Few programs attempt to directly visualize and interpret BLAST output; those that do often provide a mere basic structuring of BLAST data.

RESULTS

Here we present a bioinformatics application named BLASTGrabber suitable for high-throughput sequencing analysis. BLASTGrabber is implemented as a Java application, is OS-independent and includes a user-friendly graphical user interface. Text or XML-formatted BLAST output files can be directly imported, displayed and categorized based on BLAST statistics. Query names and FASTA headers can be analysed by text-mining. In addition to visualizing sequence alignments, BLAST data can be ordered as an interactive taxonomy tree. All modes of analysis support selection, export and storage of data. A Java interface-based plugin structure facilitates the addition of customized third-party functionality.
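To make "categorized based on BLAST statistics" concrete, the following is a minimal sketch in Python, not BLASTGrabber's own code (which is a Java GUI application). It assumes input in the standard 12-column BLAST+ tabular text format (`-outfmt 6`), whereas the abstract only says "text or XML". The function names `parse_hits` and `bin_by_evalue`, the sample hits, and the e-value thresholds are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Standard 12 columns of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(lines):
    """Yield one dict per non-empty, non-comment tabular BLAST line."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hit = dict(zip(COLUMNS, line.split("\t")))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        yield hit


def bin_by_evalue(hits, thresholds=(1e-50, 1e-10, 1e-3)):
    """Group hits into e-value bins. Thresholds are illustrative assumptions
    and must be given in ascending order."""
    bins = defaultdict(list)
    for hit in hits:
        # Pick the first (strictest) threshold the hit satisfies.
        label = next((f"<= {t:g}" for t in thresholds if hit["evalue"] <= t),
                     f"> {thresholds[-1]:g}")
        bins[label].append(hit)
    return dict(bins)


# Made-up sample hits, for illustration only.
sample = [
    "q1\tsubjA\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t370",
    "q1\tsubjB\t80.0\t150\t30\t0\t1\t150\t5\t154\t2e-20\t95.1",
    "q2\tsubjC\t60.0\t50\t20\t0\t1\t50\t1\t50\t0.5\t30.2",
]
binned = bin_by_evalue(parse_hits(sample))
for label, hits in binned.items():
    print(label, [h["sseqid"] for h in hits])
```

Each bin keeps the full hit record, so a selected category can be exported or filtered further, in the spirit of the selection and export features the abstract describes.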

CONCLUSION

The BLASTGrabber application introduces new ways of visualizing and analysing massive BLAST output data by integrating taxonomy identification, text mining capabilities and generic multi-dimensional rendering of BLAST hits. The program aims at a non-expert audience in terms of computer skills; the combination of new functionalities makes the program flexible and useful for a broad range of operations.

