Centre of Reproductive Medicine and Andrology, University of Münster, Albert-Schweitzer-Campus 1, Building D11, 48149 Münster, Germany.
Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Building A11, 48149 Münster, Germany.
BMC Bioinformatics. 2023 Jul 26;24(1):300. doi: 10.1186/s12859-023-05422-w.
Modern genome sequencing produces an ever-growing collection of genomic annotations. Combining these annotations with a set of input regions (e.g. genes) can yield new insights into genomic associations, such as those involved in gene regulation. The required data are scattered across different databases, which makes a manual approach tedious, impractical, and error-prone. Semi-automatic approaches require programming skills in data parsing, processing, overlap calculation, and visualization, which most biomedical researchers lack. Our aim was to develop an automated tool that provides all necessary algorithms and benefits both bioinformaticians and researchers without bioinformatics training.
We developed OGRE (overlapping annotated genomic regions), a comprehensive tool that associates input regions with genomic annotations and visualizes the results. It does so by parsing regions of interest, mining publicly available annotations, and calculating possible overlaps between them. The user can thus identify the location, type, and number of associated regulatory elements. Results are presented as easy-to-understand visualizations and result tables. We applied OGRE to recent studies and observed high reproducibility and potential new insights. To demonstrate OGRE's performance in terms of running time and output, we conducted a benchmark and compared its features with those of similar tools.
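The overlap calculation described above can be illustrated with a minimal Python sketch. This is not OGRE's implementation: the `Region` type, the `find_overlaps` function, the half-open coordinate convention, and the example coordinates are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """A genomic interval (hypothetical type; half-open coordinates assumed)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


def find_overlaps(queries, annotations):
    """Return (query name, annotation name, overlap width) for each overlapping pair."""
    # Group annotations by chromosome so each query is only compared
    # against annotations on the same chromosome.
    by_chrom = defaultdict(list)
    for ann in annotations:
        by_chrom[ann.chrom].append(ann)

    hits = []
    for q in queries:
        for ann in by_chrom.get(q.chrom, []):
            # Two half-open intervals overlap when the shared width is positive.
            width = min(q.end, ann.end) - max(q.start, ann.start)
            if width > 0:
                hits.append((q.name, ann.name, width))
    return hits


# Made-up input regions and annotations, for illustration only.
genes = [Region("chr1", 100, 500, "geneA"), Region("chr2", 50, 150, "geneB")]
elements = [
    Region("chr1", 450, 600, "enhancer1"),
    Region("chr1", 700, 800, "enhancer2"),
    Region("chr2", 0, 60, "promoter1"),
]
hits = find_overlaps(genes, elements)
```

This brute-force comparison is quadratic per chromosome and is meant only to make the notion of "overlap" concrete; interval trees or sorted sweeps are the usual choice for genome-scale data.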
OGRE's functions and built-in annotations can be applied as a downstream overlap association step that is compatible with most genomic sequencing outputs and can thus enrich pre-existing analysis pipelines. Compared to similar tools, OGRE shows competitive performance, offers additional features, and has been successfully applied to two recent studies. Overall, OGRE addresses the lack of tools for automatic analysis, local genomic overlap calculation, and visualization by providing an easy-to-use, end-to-end solution for both biologists and computational scientists.