Genomics Research Centre, Human Technopole, Milan 20157, Italy.
Quantitative Biology Center (QBiC), University of Tübingen, Tübingen 72076, Germany.
Bioinformatics. 2022 Jun 27;38(13):3319-3326. doi: 10.1093/bioinformatics/btac308.
Pangenome graphs provide a complete representation of the mutual alignment of collections of genomes. These models offer the opportunity to study the entire genomic diversity of a population, including structurally complex regions. Nevertheless, analyzing hundreds of gigabase-scale genomes using pangenome graphs is difficult because existing tools do not support it well. Hence, fast and versatile software is required to ask advanced questions of such data efficiently.
We wrote Optimized Dynamic Genome/Graph Implementation (ODGI), a novel suite of tools that implements scalable algorithms and has an efficient in-memory representation of DNA pangenome graphs in the form of variation graphs. ODGI supports pre-built graphs in the Graphical Fragment Assembly format. ODGI includes tools for detecting complex regions, extracting pangenomic loci, removing artifacts, exploratory analysis, manipulation, validation and visualization. Its fast parallel execution facilitates routine pangenomic tasks, as well as pipelines that can quickly answer complex biological questions of gigabase-scale pangenome graphs.
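To make the terms "variation graph" and "Graphical Fragment Assembly format" concrete, the following is a minimal Python sketch, not ODGI's implementation. It assumes GFA v1 with only S (segment), L (link) and P (path) records, and it stores everything in plain dictionaries and lists rather than an optimized in-memory layout. The names `VariationGraph`, `parse_gfa`, `path_sequence` and `EXAMPLE_GFA` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VariationGraph:
    """Toy variation graph: nodes carry sequence, paths walk the nodes."""
    segments: dict = field(default_factory=dict)  # node id -> DNA sequence
    links: list = field(default_factory=list)     # (from, orient, to, orient)
    paths: dict = field(default_factory=dict)     # path name -> [(node id, orient)]


# Translation table used to reverse-complement nodes visited in "-" orientation.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_gfa(text):
    """Read S, L and P records of GFA v1 text; other record types are skipped."""
    graph = VariationGraph()
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            graph.segments[fields[1]] = fields[2]
        elif fields[0] == "L":
            graph.links.append((fields[1], fields[2], fields[3], fields[4]))
        elif fields[0] == "P":
            # Each step looks like "12+" or "12-": node id followed by orientation.
            steps = [(step[:-1], step[-1]) for step in fields[2].split(",")]
            graph.paths[fields[1]] = steps
    return graph


def path_sequence(graph, name):
    """Spell out the genome sequence that a named path traces through the graph."""
    parts = []
    for node, orient in graph.paths[name]:
        seq = graph.segments[node]
        if orient == "-":
            seq = seq.translate(_COMPLEMENT)[::-1]
        parts.append(seq)
    return "".join(parts)


# Two hypothetical genomes that differ by a single base (node 2 versus node 3).
EXAMPLE_GFA = "\n".join([
    "H\tVN:Z:1.0",
    "S\t1\tACG",
    "S\t2\tT",
    "S\t3\tC",
    "S\t4\tGGA",
    "L\t1\t+\t2\t+\t0M",
    "L\t1\t+\t3\t+\t0M",
    "L\t2\t+\t4\t+\t0M",
    "L\t3\t+\t4\t+\t0M",
    "P\tsample_a\t1+,2+,4+\t*",
    "P\tsample_b\t1+,3+,4+\t*",
])

if __name__ == "__main__":
    graph = parse_gfa(EXAMPLE_GFA)
    for name in graph.paths:
        print(name, path_sequence(graph, name))
```

In this toy graph the two paths share nodes 1 and 4 and diverge only at the variant site, which is the property that lets a single graph represent many aligned genomes at once.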
ODGI is published as free software under the MIT open source license. Source code can be downloaded from https://github.com/pangenome/odgi and documentation is available at https://odgi.readthedocs.io. ODGI can be installed via Bioconda https://bioconda.github.io/recipes/odgi/README.html or GNU Guix https://github.com/pangenome/odgi/blob/master/guix.scm.
Supplementary data are available at Bioinformatics online.