Cavalcante Raymond G, Sartor Maureen A
Department of Computational Medicine and Bioinformatics.
Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.
Bioinformatics. 2017 Aug 1;33(15):2381-2383. doi: 10.1093/bioinformatics/btx183.
Analysis of next-generation sequencing data often results in a list of genomic regions. These may include differentially methylated CpGs/regions, transcription factor binding sites, interacting chromatin regions, or GWAS-associated SNPs, among others. A common analysis step is to annotate such regions with the genomic features they overlap (promoters, exons, enhancers, etc.). Existing tools are limited by too few annotation sources and options, slow annotation, an artificial one-to-one mapping of regions to annotations, a lack of visualization options for summarizing the data, or some combination thereof.
We developed the annotatr Bioconductor package to flexibly and quickly summarize and plot annotations of genomic regions. The annotatr package reports all intersections of regions and annotations, giving a better understanding of the genomic context of the regions. A variety of graphics functions are implemented to easily plot numerical or categorical data associated with the regions across the annotations, and across annotation intersections, providing insight into how characteristics of the regions differ across the annotations. We demonstrate that annotatr is up to 27× faster than comparable R packages. Overall, annotatr enables a richer biological interpretation of experiments.
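The difference between reporting all intersections and forcing a one-to-one mapping can be illustrated with a small, self-contained sketch. This is not annotatr's implementation or API (annotatr is an R package); the `Interval` type, the function names, the coordinates and the scores below are all hypothetical, chosen only to show that one region may overlap several annotations and that a per-region value can then be summarized per annotation type.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval; coordinates are 0-based, half-open [start, end)."""
    chrom: str
    start: int
    end: int
    label: str  # region ID, or annotation type for an annotation


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate_all(regions, annotations):
    """Return every overlapping (region, annotation) pair, not just one per region.

    Naive O(n * m) scan, kept simple for illustration only.
    """
    return [(r, a) for r in regions for a in annotations if overlaps(r, a)]


def mean_score_by_annotation(pairs, scores):
    """Average a per-region numerical value within each annotation type."""
    grouped = defaultdict(list)
    for region, annotation in pairs:
        grouped[annotation.label].append(scores[region.label])
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


# Hypothetical regions (e.g. differentially methylated regions).
regions = [
    Interval("chr1", 100, 200, "r1"),
    Interval("chr1", 450, 600, "r2"),
    Interval("chr2", 50, 80, "r3"),
]

# Hypothetical annotations; the label is the annotation type.
annotations = [
    Interval("chr1", 0, 150, "promoter"),
    Interval("chr1", 120, 500, "exon"),
    Interval("chr1", 550, 700, "enhancer"),
    Interval("chr2", 0, 60, "exon"),
]

# Hypothetical numerical data attached to each region.
scores = {"r1": 0.2, "r2": 0.8, "r3": 0.5}

pairs = annotate_all(regions, annotations)
summary = mean_score_by_annotation(pairs, scores)

for region, annotation in pairs:
    print(region.label, annotation.label)
print(summary)
```

In this toy input, regions r1 and r2 each overlap two annotations, so a one-to-one mapping would discard part of their genomic context. A production tool would use an interval index rather than the quadratic scan shown here.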
The annotatr package is available at http://bioconductor.org/packages/annotatr/ and https://github.com/rcavalcante/annotatr.
Supplementary data are available at Bioinformatics online.