Dubois Siegfried, Zytnicki Matthias, Lemaitre Claire, Faraut Thomas
Univ Rennes, CNRS, Inria, IRISA-UMR 6074, F-35000, Rennes, France.
GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France.
Bioinformatics. 2025 May 9. doi: 10.1093/bioinformatics/btaf291.
Pangenome variation graphs are increasingly used in genome analysis, where they aim to replace a linear reference across a wide variety of applications. Constructing a variation graph from a collection of chromosome-size genome sequences is a difficult task that is generally addressed with a number of heuristics. This raises the question of how much the construction method influences the resulting graph and, in turn, the characterization of variability.
We aim to characterize the differences between variation graphs derived from the same set of genomes with a metric that expresses and pinpoints these differences. We designed a pairwise variation graph comparison algorithm that establishes an edit distance between variation graphs by threading the genomes through both graphs. We applied our method to pangenome graphs built from yeast and human chromosome collections, and demonstrate that it effectively characterizes discordances between pangenome graph construction methods and scales to real datasets.
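To illustrate what threading a genome through two graphs can look like, the following is a minimal sketch and not the pancat compare implementation. It assumes that each graph cuts a genome into consecutive node-sized segments, and that the two graphs differ by the cut positions present in one and absent from the other. The function names, the "merge"/"split" labels and the toy node lengths are all hypothetical.

```python
from itertools import accumulate


def breakpoints(node_lengths):
    """Interior cut positions a path induces on its genome.

    `node_lengths` lists the lengths (in bp) of the nodes the genome
    traverses, in path order. The genome end is not counted as a cut.
    """
    ends = list(accumulate(node_lengths))
    return set(ends[:-1])


def segmentation_edits(path_a, path_b):
    """Compare how two graphs segment the same genome.

    Returns (merges, splits): cut positions found only in graph A
    (two A-nodes must be merged to match B) and only in graph B
    (an A-node must be split to match B). Assumed edit model.
    """
    if sum(path_a) != sum(path_b):
        raise ValueError("both paths must spell a genome of the same length")
    cuts_a, cuts_b = breakpoints(path_a), breakpoints(path_b)
    return sorted(cuts_a - cuts_b), sorted(cuts_b - cuts_a)


def edit_distance(paths_a, paths_b):
    """Sum merges and splits over every genome shared by both graphs."""
    total = 0
    for name in paths_a.keys() & paths_b.keys():
        merges, splits = segmentation_edits(paths_a[name], paths_b[name])
        total += len(merges) + len(splits)
    return total


# Toy example: one 10 bp genome threaded through two hypothetical graphs.
graph_a = {"genome1": [4, 3, 3]}  # cuts at positions 4 and 7
graph_b = {"genome1": [4, 6]}     # cut at position 4 only
merges, splits = segmentation_edits(graph_a["genome1"], graph_b["genome1"])
distance = edit_distance(graph_a, graph_b)
```

In this toy case the only discordance is the cut at position 7, which exists in the first graph but not in the second. Under the assumed model, the list of differing positions is what lets a comparison locate discordances along the genome rather than only report a single number.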
pancat compare is published as free Rust software under the AGPL3.0 open source license. Source code and documentation are available at https://github.com/dubssieg/rs-pancat-compare. Snapshot available on Software Heritage at swh:1:dir:61acda8ba3dac1709ed60530147d3871831be629.
Supplementary data are available online at https://doi.org/10.5281/zenodo.10932489. Code to replicate figures and analysis is available online at https://github.com/dubssieg/pancat_paper.