Jackson Chris, McLay Todd, Schmidt-Lebuhn Alexander N
Royal Botanic Gardens Victoria, Birdwood Avenue, Melbourne, Victoria 3004, Australia.
Centre for Australian National Biodiversity Research, CSIRO, Clunies Ross Street, Canberra 2601, Australian Capital Territory, Australia.
Appl Plant Sci. 2023 Jul 17;11(4):e11532. doi: 10.1002/aps3.11532. eCollection 2023 Jul-Aug.
The HybPiper pipeline has become one of the most widely used tools for the assembly of target capture data for phylogenomic analysis. After the production of locus sequences and before phylogenetic analysis, the identification of paralogs is a critical step for ensuring the accurate inference of evolutionary relationships. Algorithmic approaches using gene tree topologies for the inference of ortholog groups are computationally efficient and broadly applicable to non-model organisms, especially in the absence of a known species tree.
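As a rough illustration of why paralog identification precedes phylogenetic analysis, the following minimal sketch screens loci by checking whether any sample contributes more than one sequence to a gene tree. It is not the algorithm used by paragone-nf, and it covers only the preliminary screening step, not the topology-based pruning itself. The tip-label convention (`sample.copy`), the function names, and the example locus names are all assumptions made for illustration.

```python
from collections import Counter


def sample_of(tip_label: str) -> str:
    """Return the sample name from a tip label.

    Assumes a hypothetical naming scheme in which extra copies are
    labelled 'sample.suffix' (e.g. 's1.copy2'); labels without a dot
    are taken to be the sample name itself.
    """
    return tip_label.split(".", 1)[0]


def duplicated_samples(tip_labels):
    """Return the sorted list of samples that occur more than once among the tips."""
    counts = Counter(sample_of(label) for label in tip_labels)
    return sorted(sample for sample, n in counts.items() if n > 1)


def partition_loci(loci):
    """Split loci into putatively single-copy loci and loci needing paralog resolution.

    `loci` maps a locus name to the tip labels of its gene tree.
    Returns (single_copy, needs_pruning), where needs_pruning maps each
    flagged locus to the samples that are represented by several sequences.
    """
    single_copy = []
    needs_pruning = {}
    for locus, tips in loci.items():
        dups = duplicated_samples(tips)
        if dups:
            needs_pruning[locus] = dups
        else:
            single_copy.append(locus)
    return single_copy, needs_pruning


if __name__ == "__main__":
    example = {
        "gene001": ["s1", "s2", "s3"],
        "gene002": ["s1.main", "s1.copy2", "s2"],
    }
    single_copy, needs_pruning = partition_loci(example)
    print("single-copy loci:", single_copy)
    print("loci with duplicated samples:", needs_pruning)
```

Loci in the second group are the ones for which a tree-based method would then have to decide, from the gene tree topology, which sequences form ortholog groups.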
We containerized and expanded the functionality of both HybPiper and a pipeline for the inference of ortholog groups, providing novel options for the treatment of target capture sequence data, and allowing seamless use of the outputs of the former as inputs for the latter. The Singularity container presented here includes all dependencies, and the corresponding pipelines (hybpiper-nf and paragone-nf, respectively) are implemented via two Nextflow scripts for easier deployment and to vastly reduce the number of commands required for their use.
The hybpiper-nf and paragone-nf pipelines are easily installed and provide a user-friendly experience and robust results to the phylogenetic community. They are used by the Australian Angiosperm Tree of Life project. The pipelines are available at https://github.com/chrisjackson-pellicle/hybpiper-nf and https://github.com/chrisjackson-pellicle/paragone-nf.