Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, 1050, Belgium.
Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, 1050, Belgium.
Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae276.
SIMSApiper is a Nextflow pipeline that creates reliable, structure-informed multiple sequence alignments (MSAs) of thousands of protein sequences faster than standard structure-based alignment methods. Structural information can be provided by the user or collected by the pipeline from online resources. To significantly speed up the alignment process, the user can activate parallelization, which splits the input into subsets based on sequence identity. Finally, the number of gaps in the final alignment can be reduced by leveraging the positions of conserved secondary structure elements.
The pipeline is implemented in Nextflow, Python 3, and Bash. It is publicly available at github.com/Bio2Byte/simsapiper.
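The idea of sequence identity-based subsets mentioned in the abstract can be illustrated with a minimal Python sketch. This is not SIMSApiper's implementation: the function names `similarity` and `greedy_subsets`, the 0.7 threshold, and the use of `difflib.SequenceMatcher` as a rough stand-in for a true alignment-based identity are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Rough similarity proxy in [0, 1] (assumption: not a true sequence identity)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_subsets(seqs: dict[str, str], threshold: float = 0.7) -> list[list[str]]:
    """Greedily group sequence IDs whose similarity to a subset representative
    is at least `threshold` (hypothetical parameter)."""
    subsets: list[tuple[str, list[str]]] = []  # (representative sequence, member IDs)
    # Visit longer sequences first so that they become the representatives.
    for seq_id, seq in sorted(seqs.items(), key=lambda kv: -len(kv[1])):
        for representative, members in subsets:
            if similarity(representative, seq) >= threshold:
                members.append(seq_id)
                break
        else:
            # No existing subset is similar enough: start a new one.
            subsets.append((seq, [seq_id]))
    return [members for _, members in subsets]


if __name__ == "__main__":
    toy = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFSRA",
        "c": "GGGGGGGGGGPPPPPPPPPP",
    }
    for members in greedy_subsets(toy):
        print(members)
```

In a scheme like this, each subset could be aligned independently and in parallel before the partial alignments are combined, which is where a speed-up would come from. How the pipeline actually forms and merges its subsets is described in the repository, not here.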