Scripps Institution of Oceanography, UC San Diego, 9500 Gilman Dr, La Jolla, CA, 92093, USA.
Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 16000, Prague 6, Czech Republic.
BMC Bioinformatics. 2023 Apr 4;24(1):133. doi: 10.1186/s12859-023-05254-8.
RNA-seq followed by de novo transcriptome assembly has been a transformative technique in biological research on non-model organisms, but the computational processing of RNA-seq data requires many different software tools. The complexity of these de novo transcriptomics workflows is therefore a major barrier to researchers adopting best-practice methods and up-to-date versions of software.
Here we present a streamlined and universal de novo transcriptome assembly and annotation pipeline, transXpress, implemented in Snakemake. transXpress supports two popular assembly programs, Trinity and rnaSPAdes, and allows parallel execution on heterogeneous cluster computing hardware.
transXpress simplifies the use of best-practice methods and up-to-date software for de novo transcriptome assembly, and produces standardized output files that can be mined using SequenceServer to facilitate rapid discovery of new genes and proteins in non-model organisms.
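To illustrate what supporting two interchangeable assemblers can look like, the following is a minimal Python sketch, not transXpress's actual code. It assumes paired-end FASTQ input and builds a command line for either Trinity or rnaSPAdes from one setting. The function name `build_assembly_command`, its parameters, and the file names are hypothetical; only the command-line flags belong to the two tools.

```python
from typing import List


def build_assembly_command(
    assembler: str,
    left: str,
    right: str,
    out_dir: str,
    threads: int = 8,
    memory_gb: int = 50,
) -> List[str]:
    """Return an argument list for the chosen assembler (hypothetical helper)."""
    name = assembler.lower()
    if name == "trinity":
        # Trinity takes its memory limit as a string such as "50G".
        return [
            "Trinity",
            "--seqType", "fq",
            "--left", left,
            "--right", right,
            "--CPU", str(threads),
            "--max_memory", f"{memory_gb}G",
            "--output", out_dir,
        ]
    if name == "rnaspades":
        # rnaSPAdes takes its memory limit as an integer number of gigabytes.
        return [
            "rnaspades.py",
            "-1", left,
            "-2", right,
            "-t", str(threads),
            "-m", str(memory_gb),
            "-o", out_dir,
        ]
    raise ValueError(f"Unsupported assembler: {assembler!r}")


if __name__ == "__main__":
    # Example file names are placeholders, not files shipped with any tool.
    cmd = build_assembly_command(
        "trinity", "reads_1.fq", "reads_2.fq", "trinity_out"
    )
    print(" ".join(cmd))
```

In a workflow manager such as Snakemake, a helper of this kind would typically be called from a single assembly rule, so that downstream annotation steps do not need to know which assembler produced the transcripts.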