Kasianova Aleksandra M, Penin Aleksey A, Schelkunov Mikhail I, Kasianov Artem S, Logacheva Maria D, Klepikova Anna V
Institute for Information Transmission, Russian Academy of Sciences, Moscow, Russia.
Skolkovo Institute of Science and Technology, Moscow, Russia.
Plant Methods. 2024 Aug 17;20(1):128. doi: 10.1186/s13007-024-01255-7.
Because the genomes of many eukaryotic species, especially plants, are large and complex, their de novo sequencing and assembly remains difficult despite progress in sequencing technologies. An alternative to genome assembly is assembly of the transcriptome, the set of RNA products of the expressed genes. Although many de novo transcriptome assemblers exist, features specific to transcriptomes (the existence of isoforms and uneven expression levels across genes) complicate the generation of high-quality assemblies suitable for downstream analyses.
We developed Trans2express, a web-based tool and pipeline for de novo hybrid transcriptome assembly and postprocessing, based on rnaSPAdes followed by a set of filtering steps. The pipeline was tested on Arabidopsis thaliana cDNA sequencing data obtained using Illumina and Oxford Nanopore Technologies platforms, and on three non-model plant species. Comparing the structural characteristics of the transcriptome assembly with the reference Arabidopsis genome showed that the assembly is of high quality, with 86.1% of expressed Arabidopsis genes assembled as a single contig. We also tested the applicability of the transcriptome assembly to gene expression analysis. For both Arabidopsis and the non-model species, gene expression levels and sets of differentially expressed genes were highly congruent between genome-based and transcriptome-assembly-based analyses.
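As a rough illustration of the single-contig statistic reported above, the following minimal sketch computes the share of expressed genes that are represented by exactly one contig. It is not the Trans2express implementation: the function name, the contig-to-gene mapping, and the toy data are hypothetical, and the abstract does not state the exact criteria (for example, coverage thresholds) used to assign contigs to genes.

```python
from collections import defaultdict


def single_contig_fraction(contig_to_gene, expressed_genes):
    """Return the fraction of expressed genes matched by exactly one contig.

    contig_to_gene: dict mapping a contig ID to the gene it was assigned to
                    (hypothetical input; how contigs are assigned is assumed).
    expressed_genes: iterable of gene IDs considered expressed.
    """
    contigs_per_gene = defaultdict(set)
    for contig, gene in contig_to_gene.items():
        contigs_per_gene[gene].add(contig)

    expressed = set(expressed_genes)
    if not expressed:
        return 0.0

    # Genes with no contig at all, or with several contigs, do not count.
    single = sum(1 for gene in expressed
                 if len(contigs_per_gene.get(gene, ())) == 1)
    return single / len(expressed)


# Toy data, invented for illustration only.
toy_assignments = {"c1": "geneA", "c2": "geneB", "c3": "geneB", "c4": "geneC"}
toy_expressed = ["geneA", "geneB", "geneC", "geneD"]

fraction = single_contig_fraction(toy_assignments, toy_expressed)
print(f"{fraction:.1%}")
```

In this toy case geneA and geneC each have one contig, geneB is split across two, and geneD is not assembled, so two of the four expressed genes count as single-contig.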
We present Trans2express, a protocol for de novo hybrid transcriptome assembly aimed at recovering a single transcript per gene. We expect this protocol to promote the characterization of transcriptomes and gene expression analysis in non-model plants, and the web-based tool to be of use to a wide range of plant biologists.