Bioinformatics and Computational Biology, Genentech, Inc., South San Francisco, California, United States of America.
PLoS Comput Biol. 2013;9(8):e1003118. doi: 10.1371/journal.pcbi.1003118. Epub 2013 Aug 8.
We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.
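To make two of the operations named above concrete, the following is a minimal sketch in plain Python of overlap detection and coverage calculation on integer ranges. It is an illustration only and does not reproduce the IRanges or GenomicRanges API or their algorithms; the function names `find_overlaps` and `coverage`, the closed 1-based interval convention, and the `width` parameter are all assumptions made for this example.

```python
from bisect import bisect_right
from typing import List, Tuple

# Assumption for this sketch: a range is a closed, 1-based interval (start, end).
Range = Tuple[int, int]


def find_overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return (query_index, subject_index) pairs whose ranges share a position."""
    # Visit subject ranges in order of start so a binary search can prune
    # every subject that starts after the query ends.
    order = sorted(range(len(subject)), key=lambda i: subject[i][0])
    starts = [subject[i][0] for i in order]
    hits = []
    for qi, (q_start, q_end) in enumerate(query):
        upper = bisect_right(starts, q_end)
        for si in order[:upper]:
            # The subject starts at or before q_end; it overlaps if it
            # also ends at or after q_start.
            if subject[si][1] >= q_start:
                hits.append((qi, si))
    return sorted(hits)


def coverage(ranges: List[Range], width: int) -> List[int]:
    """Return the number of ranges covering each position 1..width.

    Assumes 1 <= start <= end <= width for every range.
    """
    # Difference array: +1 where a range begins, -1 just past where it ends.
    diff = [0] * (width + 2)
    for start, end in ranges:
        diff[start] += 1
        diff[end + 1] -= 1
    depth = 0
    result = []
    for pos in range(1, width + 1):
        depth += diff[pos]
        result.append(depth)
    return result
```

The pruning in `find_overlaps` is a simple heuristic and can still scan many subject ranges per query in the worst case; it is meant to show the overlap test, not a scalable index.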