Li Weiyi, Fan Qilian, Yang Yi, Xiao Xiang, Li Jing, Zhang Yu
School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China.
State Key Laboratory of Submarine Geoscience; Key Laboratory of Polar Ecosystem and Climate Change, Ministry of Education; Shanghai Key Laboratory of Polar Life and Environment Sciences; and School of Oceanography, Shanghai Jiao Tong University, 1954 Huashan Road, Shanghai 200030, China.
ISME Commun. 2025 May 29;5(1):ycaf090. doi: 10.1093/ismeco/ycaf090. eCollection 2025 Jan.
Metatranscriptomic analysis is increasingly applied to environmental samples to provide dynamic gene expression information on ecosystems as they respond to changing conditions. Many computational methods have been developed in recent years, but a comprehensive benchmark study is still lacking. There are concerns regarding the accuracy of the qualitative and quantitative profiles obtained from metatranscriptomic analysis, especially for microbiota in extreme environments, most of which are unculturable and lack well-annotated reference genomes. Here, we present a benchmark experiment that included 10 single species and their cell or RNA admixtures with predefined species compositions and varying evenness, simulating low annotation rates and high heterogeneity. In total, 1 metagenome sample and 24 metatranscriptomes were sequenced to compare 36 combinations of analysis methods across tasks ranging from sample preparation, quality control, rRNA removal, and alignment strategies to taxonomic profiling and transcript quantification. For each part of this workflow, corresponding metrics were established to serve as standards for assessment and comparison. The evaluation revealed the performance of these methods, and on that basis we propose an optimized pipeline named MT-Enviro (MetaTranscriptomic analysis for ENVIROnmental microbiome). Our data and analysis provide a comprehensive framework for benchmarking computational methods for metatranscriptomic analysis. MT-Enviro is implemented in Nextflow and is freely available from https://github.com/Li-Lab-SJTU/MT-Enviro.
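The "varying evenness" of the mock admixtures can be made concrete with a small sketch. The abstract does not say which evenness measure the authors used, so Pielou's J (Shannon entropy divided by its maximum, ln S) is an assumption here, and the function name and both abundance vectors are hypothetical illustrations rather than the study's actual compositions.

```python
import math


def pielou_evenness(abundances):
    """Return Pielou's J = H / ln(S) for a list of non-negative abundances.

    H is the Shannon entropy of the nonzero proportions and S is the number
    of species with nonzero abundance. This measure is an assumed choice;
    the abstract does not name the one used in the benchmark.
    """
    counts = [a for a in abundances if a > 0]
    if len(counts) < 2:
        # J is undefined for fewer than two species; 0.0 is a convention here.
        return 0.0
    total = sum(counts)
    shannon = -sum((c / total) * math.log(c / total) for c in counts)
    return shannon / math.log(len(counts))


# Hypothetical 10-species admixtures (relative amounts, arbitrary units).
even_mix = [10.0] * 10
skewed_mix = [55, 20, 10, 5, 4, 2, 1.5, 1, 1, 0.5]

print(f"even mix J   = {pielou_evenness(even_mix):.3f}")
print(f"skewed mix J = {pielou_evenness(skewed_mix):.3f}")
```

A perfectly even admixture gives J = 1, and J falls toward 0 as a few species dominate, which is one way to describe the heterogeneity such mock communities are designed to simulate.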