UMR PHIM, CIRAD, Montpellier, France.
PHIM Plant Health Institute, Univ Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France.
Bioinformatics. 2021 Dec 22;38(1):267-269. doi: 10.1093/bioinformatics/btab493.
We previously presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) free of arbitrary global clustering thresholds. Here, we present swarm v3, which addresses the challenges posed by contemporary datasets that are growing towards terabyte sizes.
Compared with previous swarm versions, swarm v3 has modernized C++ source code, a memory footprint reduced by up to 50%, and optimized CPU usage and multithreading (more than 7 times faster with default parameters). It has also been extensively tested for robustness and logic.
Source code and binaries are available at https://github.com/torognes/swarm.
Supplementary data are available at Bioinformatics online.