Einarsson Sveinn V, Rivers Adam R
Department of Microbiology and Cell Science, University of Florida, Gainesville, Florida, USA.
United States Department of Agriculture, Agricultural Research Service, Gainesville, Florida, USA.
Microbiol Spectr. 2024 Nov 5;12(12):e0060124. doi: 10.1128/spectrum.00601-24.
The nuclear ribosomal RNA (rRNA) internal transcribed spacer (ITS) regions are commonly used to identify fungi and other eukaryotic taxa in amplicon sequencing. The highly conserved rRNA regions flanking the ITS are often trimmed before the sequences are used for taxonomic assignment. The Python software package ITSxpress rapidly trims single-end or paired-end sequences in FASTQ format for use in amplicon sequence variant clustering methods like DADA2. This new major release of ITSxpress improves the paired-end merging method, simplifies installation of the QIIME 2 ITSxpress plugin, removes major dependencies, adds use cases, and is compatible with newer compression formats. This article discusses the modifications to ITSxpress that improve the output and user experience, leading to a major version increase.
IMPORTANCE
ITSxpress is a sequence trimming method applied to internal transcribed spacer (ITS) amplicon sequences before calling amplicon sequence variants (ASVs). The ITS region is used to understand the composition of eukaryotic microbial communities. ITS sequences provide good taxonomic resolution due to their hypervariability, but are flanked by conserved regions that allow their primers to be more universal. Amplicons generated with such primers contain regions with different evolutionary rates, and trimming these conserved regions results in better taxonomic classification and a more valid set of ASVs. This package can be used for most amplicon sequencing methods, including newer long-read sequencing formats such as PacBio. ITSxpress can be installed from Bioconda, used as a Docker image, or installed from source code. The package works well on high-performance computing clusters and on laptops due to its low resource requirements.
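The trimming idea described above can be illustrated with a minimal Python sketch. It is not ITSxpress code and does not reflect how the package locates the ITS boundaries: it assumes, purely for illustration, that the conserved flanks can be found as exact literal motifs. The names `FastqRecord` and `trim_flanks` and the two anchor sequences are hypothetical. The point it shows is that the sequence and its quality string must be cut at the same coordinates so that the trimmed record stays valid FASTQ.

```python
from typing import NamedTuple, Optional


class FastqRecord(NamedTuple):
    """One FASTQ read: identifier, bases, and per-base quality characters."""
    name: str
    seq: str
    qual: str


def trim_flanks(rec: FastqRecord, left_anchor: str,
                right_anchor: str) -> Optional[FastqRecord]:
    """Keep only the bases between the two anchors.

    Returns None when either anchor is missing, so the caller can
    discard reads whose spacer region cannot be delimited.
    """
    left = rec.seq.find(left_anchor)
    if left == -1:
        return None
    start = left + len(left_anchor)
    # Search for the right anchor only downstream of the left one.
    end = rec.seq.find(right_anchor, start)
    if end == -1:
        return None
    # Slice sequence and quality with the same coordinates.
    return FastqRecord(rec.name, rec.seq[start:end], rec.qual[start:end])


# Toy read: conserved flank + variable spacer + conserved flank.
LEFT = "GGATCATTA"   # hypothetical stand-in for a conserved left flank
RIGHT = "TTGACCTCG"  # hypothetical stand-in for a conserved right flank
read = FastqRecord("read1",
                   LEFT + "ACGTTGCA" + RIGHT,
                   "I" * 9 + "F" * 8 + "I" * 9)

trimmed = trim_flanks(read, LEFT, RIGHT)
print(trimmed)
```

Exact string matching is used here only to keep the sketch short; real reads contain sequencing errors and taxon-specific variation, so a tolerant search over the conserved regions would be needed in practice.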