Furió-Tarí Pedro, Conesa Ana, Tarazona Sonia
Genomics of Gene Expression Laboratory, Gene Expression and Epigenomics Program, Centro de Investigación Príncipe Felipe, Eduardo Primo Yúfera 3, 46012, Valencia, Spain.
Microbiology and Cell Science Department, Institute of Food and Agricultural Sciences, University of Florida, Gainesville, FL, 32603, USA.
BMC Bioinformatics. 2016 Nov 22;17(Suppl 15):427. doi: 10.1186/s12859-016-1293-1.
The integrative analysis of multiple genomics data types often requires that genome coordinate-based signals be associated with proximal genes. The location of a genomic region relative to the gene (the gene area) is important for functional interpretation of the data; hence, algorithms that match regions to genes should also report this information.
In this work we review the publicly available tools for making region-to-gene associations. We also present a novel method, RGmatch, a flexible and easy-to-use Python tool that computes associations at the gene, transcript, or exon level, applying a set of rules to annotate each region-gene association with the location of the region within the gene. RGmatch can be applied to any organism as long as a genome annotation is available. Furthermore, we qualitatively and quantitatively compare RGmatch to other tools.
RGmatch simplifies the association of a genomic region with its closest gene. It is also a powerful tool, because the rules used to annotate these associations are easy to modify according to the researcher's specific interests. RGmatch differs from existing similar tools in its flexibility, its wide range of user options, its compatibility with any organism for which an annotation is available, and its comprehensive, user-friendly output.
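As an illustration of what a region-to-gene association annotated with a gene area involves, the following is a minimal sketch and not RGmatch's actual code or rule set. The `Gene` record, the `classify` and `match_region` functions, the four area labels, and the 1000 bp promoter window are all assumptions made for this example. The sketch works at the gene level only and uses the region midpoint, whereas the abstract describes associations that can also be computed at the transcript or exon level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def classify(region_start, region_end, gene, promoter=1000):
    """Label the region midpoint relative to one gene.

    Returns (area_label, distance_in_bp). The labels and the promoter
    window are illustrative assumptions, not RGmatch's own rules.
    """
    mid = (region_start + region_end) // 2
    # The transcription start site depends on the strand.
    tss = gene.start if gene.strand == "+" else gene.end
    # Signed offset from the TSS, measured in the direction of transcription.
    offset = mid - tss if gene.strand == "+" else tss - mid
    gene_len = gene.end - gene.start
    if offset < -promoter:
        return "UPSTREAM", -offset
    if offset < 0:
        return "PROMOTER", -offset
    if offset <= gene_len:
        return "GENE_BODY", 0
    return "DOWNSTREAM", offset - gene_len


def match_region(chrom, start, end, genes, promoter=1000):
    """Return (gene_name, area_label, distance) for the closest gene, or None."""
    candidates = [g for g in genes if g.chrom == chrom]
    if not candidates:
        return None
    scored = [(classify(start, end, g, promoter), g) for g in candidates]
    # Pick the gene with the smallest distance to the region midpoint.
    (label, dist), gene = min(scored, key=lambda item: item[0][1])
    return gene.name, label, dist


if __name__ == "__main__":
    genes = [
        Gene("A", "chr1", 1000, 2000, "+"),
        Gene("B", "chr1", 5000, 6000, "-"),
    ]
    print(match_region("chr1", 400, 600, genes))
    print(match_region("chr1", 6400, 6600, genes))
```

Measuring the offset along the direction of transcription lets the same comparisons serve both strands. Changing the thresholds or adding labels inside `classify` is the kind of rule adjustment the abstract refers to, although how RGmatch exposes such options is not specified here.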