Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, 38010, Spain.
Instituto de Salud Carlos III, CIBER de Enfermedades Respiratorias, Madrid, 28029, Spain.
Bioinformatics. 2021 Jul 12;37(11):1600-1601. doi: 10.1093/bioinformatics/btaa900.
NanoCLUST is an analysis pipeline for the classification of amplicon-based full-length 16S rRNA nanopore reads. Its defining feature is an unsupervised read clustering step based on Uniform Manifold Approximation and Projection (UMAP), followed by the construction of a polished read for each cluster and its subsequent BLAST classification. Here, we demonstrate that NanoCLUST performs better than other state-of-the-art software in the characterization of two commercial mock communities, enabling accurate bacterial identification and abundance profile estimation at species-level resolution.
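The two ends of such a workflow can be illustrated with a minimal sketch that is not taken from the NanoCLUST source code. It assumes that each read is summarized as a normalized k-mer frequency vector before projection (the abstract does not state which features are used) and that the abundance profile is estimated from the number of reads assigned to each cluster. The function names `kmer_profile` and `relative_abundance`, the default `k`, and the convention that the label `-1` marks unclustered reads are all hypothetical. The UMAP projection and the clustering itself are omitted; in practice they would be applied to the vectors produced here with a dedicated library.

```python
from collections import Counter
from itertools import product


def kmer_profile(read, k=3):
    """Return the normalized k-mer frequency vector of a read.

    The vector has 4**k entries, one per k-mer over the ACGT alphabet,
    in lexicographic order. K-mers containing other symbols (e.g. N)
    are ignored. Assumption: this is the kind of feature vector that
    would be fed to a projection method such as UMAP.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def relative_abundance(cluster_labels):
    """Estimate relative abundance as the fraction of reads per cluster.

    Assumption: the label -1 marks reads left unclustered, which are
    excluded from the profile.
    """
    counts = Counter(label for label in cluster_labels if label != -1)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Toy read and toy cluster labels, for illustration only.
    print(kmer_profile("ACGTACGT", k=2))
    print(relative_abundance([0, 0, 1, -1]))
```

Each cluster label would then be mapped to a taxon by classifying the polished read built from that cluster, so that the per-cluster fractions become a species-level abundance profile.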
Source code, test data and documentation of NanoCLUST are freely available at https://github.com/genomicsITER/NanoCLUST under the MIT License.
Supplementary data are available at Bioinformatics online.