Department of Ecology, University of Kaiserslautern, Kaiserslautern, Germany.
Department of Molecular Ecology, University of Kaiserslautern, Kaiserslautern, Germany.
Environ Microbiol. 2019 Nov;21(11):4109-4124. doi: 10.1111/1462-2920.14764. Epub 2019 Aug 16.
Effective and precise grouping of highly similar sequences remains a major bottleneck in the evaluation of high-throughput sequencing datasets. Amplicon sequence variants (ASVs) offer a promising alternative that may supersede the widely used operational taxonomic units (OTUs) in environmental sequencing studies. We compared the performance of a recently developed pipeline based on the DADA2 algorithm for obtaining ASVs against a pipeline based on the SWARM algorithm for obtaining OTUs. Illumina sequencing of 29 individual ciliate species resulted in up to 11 ASVs per species, while SWARM produced up to 19 OTUs per species. To improve the congruency between species diversity and molecular diversity, we applied sequence similarity networks (SSNs) for second-level sequence grouping into network sequence clusters (NSCs). At 100% sequence similarity in SWARM-SSNs, NSC numbers decreased from a 7.9-fold overestimation without an abundance filter to a 4.5-fold overestimation when an abundance filter was applied. For the DADA2-SSN approach, NSC numbers decreased from a 3.5-fold to a 3-fold overestimation. Rand index cluster analyses predicted the best binning results between 97% and 94% sequence similarity for both DADA2-SSNs and SWARM-SSNs. Depending on the ecological questions addressed in an environmental sequencing study with protists, we recommend ASVs as a replacement for OTUs, ideally in combination with SSNs.
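The second-level grouping and its evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes toy sequences that are already aligned to equal length, uses a simple per-position identity in place of a real pairwise alignment, and treats the connected components of the similarity network as NSCs. All function names and the example sequences are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences.

    Assumption: a real analysis would compute identity from a pairwise alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def network_clusters(seqs, threshold):
    """Label each sequence with the connected component (NSC) it falls into.

    Two sequences are joined by an edge when their identity is >= threshold.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if identity(seqs[i], seqs[j]) >= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(seqs))]


def rand_index(labels_a, labels_b):
    """Share of item pairs on which two groupings agree (same vs. different group)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def fold_overestimation(n_clusters, n_species):
    """Number of clusters per true species; 1.0 means no overestimation."""
    return n_clusters / n_species


# Hypothetical toy data: two species, each with two variants differing at one site.
seqs = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",
    "TTTTGGGGCCCCAAAATTTT",
    "TTTTGGGGCCCCAAAATTTA",
]
species = ["A", "A", "B", "B"]

for threshold in (1.0, 0.94):
    nsc = network_clusters(seqs, threshold)
    print(
        threshold,
        len(set(nsc)),
        fold_overestimation(len(set(nsc)), len(set(species))),
        rand_index(nsc, species),
    )
```

In this toy case, the 100% threshold leaves every variant in its own cluster (a 2-fold overestimation), while the 94% threshold merges each species' variants into one NSC and the Rand index against the known species labels reaches 1.0.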