Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Chiba, Japan.
Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan.
Bioinformatics. 2018 Jul 15;34(14):2490-2492. doi: 10.1093/bioinformatics/bty121.
We report an update to the MAFFT multiple sequence alignment program that enables parallel calculation for large numbers of sequences. The G-INS-1 option of MAFFT was recently reported to be more accurate than other methods on large datasets, but it has been impractical for most large-scale analyses because it requires large computational resources. We introduce a scalable variant, G-large-INS-1, which has accuracy equivalent to G-INS-1 and is applicable to 50 000 or more sequences.
This feature is available in MAFFT versions 7.355 or later at https://mafft.cbrc.jp/alignment/software/mpi.html.
Supplementary data are available at Bioinformatics online.