Panacus: fast and exact pangenome growth and core size estimation.

Authors

Parmigiani Luca, Garrison Erik, Stoye Jens, Marschall Tobias, Doerr Daniel

Affiliations

Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld 33615, Germany.

Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, United States.

Publication information

Bioinformatics. 2024 Nov 28;40(12). doi: 10.1093/bioinformatics/btae720.

Abstract

MOTIVATION

Using a single linear reference genome poses a limitation to exploring the full genomic diversity of a species. The release of a draft human pangenome underscores the increasing relevance of pangenomics to overcome these limitations. Pangenomes are commonly represented as graphs, which can represent billions of base pairs of sequence. Presently, there is a lack of scalable software able to perform key tasks on pangenomes, such as quantifying universally shared sequence across genomes (the core genome) and measuring the extent of genomic variability as a function of sample size (pangenome growth).
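To make the two quantities concrete, here is a minimal sketch, assuming each genome is reduced to a set of item identifiers (for example graph nodes). It is an illustration of the definitions, not the Panacus implementation: the function names and the toy data are hypothetical. The expected union size over all subsets of m genomes is computed in closed form from a coverage histogram, using the fact that an item present in c of n genomes is missed by a random subset of size m with probability C(n-c, m) / C(n, m).

```python
from math import comb


def coverage_histogram(genomes):
    """Return hist where hist[c] is the number of items found in exactly c genomes."""
    n = len(genomes)
    counts = {}
    for items in genomes:
        for item in set(items):
            counts[item] = counts.get(item, 0) + 1
    hist = [0] * (n + 1)
    for c in counts.values():
        hist[c] += 1
    return hist


def pangenome_growth(hist):
    """Expected number of distinct items in a random subset of m genomes, for m = 1..n."""
    n = len(hist) - 1
    growth = []
    for m in range(1, n + 1):
        total = comb(n, m)
        # An item seen in c genomes is absent from a subset with probability
        # comb(n - c, m) / comb(n, m); sum the complementary probability.
        expected = sum(
            h * (1 - comb(n - c, m) / total)
            for c, h in enumerate(hist)
            if c > 0
        )
        growth.append(expected)
    return growth


def core_size(hist):
    """Number of items shared by all genomes (strict core, no quorum relaxation)."""
    return hist[-1]


# Hypothetical toy data: three genomes over five items.
genomes = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
hist = coverage_histogram(genomes)
growth = pangenome_growth(hist)
core = core_size(hist)
```

Working from the histogram avoids enumerating subsets, so the cost depends on the number of genomes rather than on the number of possible orderings.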

RESULTS

We introduce Panacus (pangenome-abacus), a tool designed to rapidly perform these tasks and visualize the results in interactive plots. Panacus can process GFA files, the accepted standard for pangenome graphs, and is able to analyze a human pangenome graph with 110 million nodes in <1 h.
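As a rough illustration of the kind of input involved, the sketch below reads a tiny GFA 1 graph from a string and reports which nodes, and how many base pairs, are shared by every path. It assumes the standard GFA 1 record layout (tab-separated `S` segment lines and `P` path lines whose steps carry a trailing `+` or `-` orientation). The function name, sample names, and sequences are made up for the example, and this is not how Panacus itself parses graphs.

```python
def parse_gfa(text):
    """Return (segment lengths, node set per path) from GFA 1 text.

    Simplification: segment length is taken from the sequence field, so
    segments stored as '*' with an LN tag are not handled here.
    """
    seg_len = {}
    paths = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            # Drop the orientation sign so both strands count as the same node.
            paths[fields[1]] = {step[:-1] for step in fields[2].split(",")}
    return seg_len, paths


# Hypothetical toy graph: four segments, three sample paths.
records = [
    ("H", "VN:Z:1.0"),
    ("S", "1", "ACGT"),
    ("S", "2", "G"),
    ("S", "3", "TT"),
    ("S", "4", "CCC"),
    ("P", "sampleA", "1+,2+,4+", "*"),
    ("P", "sampleB", "1+,3+,4+", "*"),
    ("P", "sampleC", "1+,2-,4+", "*"),
]
gfa_text = "\n".join("\t".join(rec) for rec in records)

seg_len, paths = parse_gfa(gfa_text)
core_nodes = set.intersection(*paths.values())
core_bp = sum(seg_len[node] for node in core_nodes)
```

Counting in base pairs rather than nodes weights each shared node by its sequence length, which matters when node sizes vary widely.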

AVAILABILITY AND IMPLEMENTATION

Panacus is implemented in Rust and is published as Open Source software under the MIT license. The source code and documentation are available at https://github.com/marschall-lab/panacus. Panacus can be installed via Bioconda at https://bioconda.github.io/recipes/panacus/README.html.
