Infectious Disease Service, Department of Medicine, and Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center (MSKCC), New York, New York, USA.
JCI Insight. 2022 Jan 11;7(1):e151663. doi: 10.1172/jci.insight.151663.
Identification and analysis of fungal communities commonly rely on internal transcribed spacer-based (ITS-based) amplicon sequencing. There is no gold standard used to infer and classify fungal constituents since methodologies have been adapted from analyses of bacterial communities. To achieve high-resolution inference of fungal constituents, we customized a DADA2-based pipeline using a mix of 11 medically relevant fungi. While DADA2 allowed the discrimination of ITS1 sequences differing by single nucleotides, quality filtering, sequencing bias, and database selection were identified as key variables determining the accuracy of sample inference. Due to species-specific differences in sequencing quality, default filtering settings removed most reads that originated from Aspergillus species, Saccharomyces cerevisiae, and Candida glabrata. By fine-tuning the quality filtering process, we achieved an improved representation of the fungal communities. By adapting a wobble nucleotide in the ITS1 forward primer region, we further increased the yield of S. cerevisiae and C. glabrata sequences. Finally, we showed that a BLAST-based algorithm based on the UNITE+INSD or the NCBI NT database achieved a higher reliability in species-level taxonomic annotation compared with the naive Bayesian classifier implemented in DADA2. These steps optimized a robust fungal ITS1 sequencing pipeline that, in most instances, enabled species-level assignment of community members.
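The quality-filtering step described in the abstract can be illustrated with a minimal sketch of expected-error filtering, the kind of criterion exposed by the `maxEE` argument of DADA2's `filterAndTrim`. The expected error of a read is the sum of its per-base error probabilities, 10^(-Q/10), over all Phred scores Q. The function names, the example reads, and the thresholds below are assumptions chosen for illustration; they are not the settings reported by the authors, and the sketch is written in Python rather than R.

```python
from typing import List, Tuple

# A read is (name, sequence, quality string); Phred+33 encoding is assumed.
Read = Tuple[str, str, str]


def expected_errors(quals: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, 10^(-Q/10), for one read."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quals)


def filter_reads(reads: List[Read], max_ee: float) -> List[Read]:
    """Keep reads whose expected error count does not exceed max_ee."""
    return [r for r in reads if expected_errors(r[2]) <= max_ee]


# Hypothetical reads: one with uniformly high quality (Q40),
# one with uniformly low quality (Q2).
reads: List[Read] = [
    ("high_quality_read", "ACGTACGTAC", "I" * 10),
    ("low_quality_read", "ACGTACGTAC", "#" * 10),
]

if __name__ == "__main__":
    for threshold in (2.0, 8.0):  # illustrative thresholds only
        kept = [name for name, _, _ in filter_reads(reads, threshold)]
        print(f"max_ee={threshold}: kept {kept}")
```

A strict threshold discards the low-quality read entirely, while a relaxed one retains it. This mirrors the trade-off noted in the abstract: if reads from some species systematically have lower quality scores, a single strict cutoff can remove most of them and distort the inferred community composition.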