Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA 22908, USA.
Bioinformatics. 2010 Mar 15;26(6):841-2. doi: 10.1093/bioinformatics/btq033. Epub 2010 Jan 28.
Testing for correlations between different sets of genomic features is a fundamental task in genomics research. However, searching for overlaps between features with existing web-based methods is complicated by the massive datasets that are routinely produced with current sequencing technologies. Fast and flexible tools are therefore required to ask complex questions of these data in an efficient manner.
This article introduces BEDTools, a new software suite for the comparison, manipulation and annotation of genomic features in Browser Extensible Data (BED) and General Feature Format (GFF) formats. BEDTools also supports the comparison of sequence alignments in BAM format to both BED and GFF features. The tools are extremely efficient and allow the user to compare large datasets (e.g. next-generation sequencing data) with both public and custom genome annotation tracks. The individual tools can be combined with one another as well as with standard UNIX commands, which facilitates both routine genomics tasks and pipelines that can quickly answer intricate questions of large genomic datasets.
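To make the notion of comparing two feature sets concrete, the following is a minimal Python sketch of an overlap test between BED-style intervals. It is not taken from BEDTools and does not reflect how the suite is implemented: the names `Feature`, `parse_bed_line`, `overlap_length` and `intersect` are hypothetical, the comparison is a naive all-against-all scan, and only the first three BED columns (chromosome, start, end) are read. It relies on the BED convention that start coordinates are 0-based and end coordinates are exclusive.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    """A genomic interval using BED coordinates (hypothetical helper type)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def parse_bed_line(line: str) -> Feature:
    """Read the first three tab-separated columns of a BED line."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return Feature(chrom, int(start), int(end))


def overlap_length(a: Feature, b: Feature) -> int:
    """Number of bases shared by two features (0 if they do not overlap)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def intersect(queries: List[Feature],
              annotations: List[Feature]) -> List[Tuple[Feature, Feature]]:
    """Return every (query, annotation) pair sharing at least one base.

    This is a naive O(n * m) scan for illustration only; it is not the
    algorithm used by BEDTools.
    """
    return [(q, t) for q in queries for t in annotations
            if overlap_length(q, t) > 0]


if __name__ == "__main__":
    reads = [parse_bed_line("chr1\t100\t200"), parse_bed_line("chr1\t500\t600")]
    genes = [Feature("chr1", 150, 300), Feature("chr2", 100, 200)]
    for read, gene in intersect(reads, genes):
        print(read, gene, overlap_length(read, gene))
```

Because BED intervals are half-open, two features that merely abut (one ending at position 200, the next starting at 200) share no bases and are not reported by this sketch.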
BEDTools was written in C++. Source code and a comprehensive user manual are freely available at http://code.google.com/p/bedtools
aaronquinlan@gmail.com; imh4y@virginia.edu
Supplementary data are available at Bioinformatics online.