He Limuxuan, Zou Quan, Wang Yansu
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China.
Macao Polytechnic University, Macau Peninsula Gomes Street, Macau, 999078, China.
BMC Bioinformatics. 2025 Apr 26;26(1):111. doi: 10.1186/s12859-025-06137-w.
Widely accessible sequencing technologies have enabled meta-transcriptomic studies to probe microbial ecology at the transcriptional level. Analyzing omics data involves multiple steps, each requiring different bioinformatics tools. As public microbiome datasets become more widely available, meta-analyses can reveal new insights into microbiome activity. However, reproducibility is often compromised because sample omics data are processed with differing methods. Efficient analytical workflows that ensure the repeatability, reproducibility, and traceability of results are therefore essential in microbiome research.
We developed metaTP, a pipeline that integrates bioinformatics tools for the comprehensive analysis of meta-transcriptomic data. The pipeline covers quality control, non-coding RNA removal, transcript expression quantification, differential gene expression analysis, functional annotation, and co-expression network analysis. To quantify mRNA expression, metaTP relies on reference indexes built from protein-coding sequences, which helps overcome the limitations of database analysis. metaTP also provides a function for calculating the topological properties of gene co-expression networks, offering an intuitive explanation for correlated gene sets in high-dimensional datasets. We anticipate that metaTP will help researchers address microbiota-related biological questions and make microbiota RNA-Seq data more accessible and easier to interpret.
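To illustrate what "topological properties of a gene co-expression network" can mean in practice, the following is a minimal sketch and not metaTP's own implementation. It assumes an unweighted network in which two genes are linked when the absolute Pearson correlation of their expression profiles reaches a cutoff. The gene names, the toy expression values, the 0.8 threshold, and the function names `pearson`, `build_network`, and `topology` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def build_network(expression, threshold=0.8):
    """Link two genes when |r| >= threshold (the threshold is an assumption)."""
    adjacency = {gene: set() for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            adjacency[g1].add(g2)
            adjacency[g2].add(g1)
    return adjacency


def topology(adjacency):
    """Compute a few basic topological properties of an undirected network."""
    n = len(adjacency)
    n_edges = sum(len(nb) for nb in adjacency.values()) // 2
    degree = {gene: len(nb) for gene, nb in adjacency.items()}
    clustering = {}
    for gene, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            clustering[gene] = 0.0  # undefined for k < 2; reported as 0 here
            continue
        # Count the links that exist among this gene's neighbours.
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
        clustering[gene] = 2 * links / (k * (k - 1))
    density = 2 * n_edges / (n * (n - 1)) if n > 1 else 0.0
    return {
        "nodes": n,
        "edges": n_edges,
        "density": density,
        "degree": degree,
        "clustering": clustering,
    }


# Hypothetical expression values: one list per gene, one entry per sample.
expression = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [3, 1, 4, 1, 5],
}

stats = topology(build_network(expression))
print(stats)
```

In this toy input, geneA, geneB, and geneC are perfectly correlated or anti-correlated and form a triangle, while geneD falls below the cutoff and stays isolated. Real co-expression analyses typically use weighted networks and many more samples, so the hard threshold here is a simplification.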
We created a conda package that integrates the tools into the pipeline, making metaTP a flexible and versatile tool for handling meta-transcriptomic sequencing data. The metaTP pipeline is freely available at https://github.com/nanbei45/metaTP.