Pardo-Palacios Francisco J, Arzalluz-Luque Angeles, Kondratova Liudmyla, Salguero Pedro, Mestre-Tomás Jorge, Amorín Rocío, Estevan-Morió Eva, Liu Tianyuan, Nanni Adalena, McIntyre Lauren, Tseng Elizabeth, Conesa Ana
bioRxiv. 2023 Jun 3:2023.05.17.541248. doi: 10.1101/2023.05.17.541248.
The emergence of long-read RNA sequencing (lrRNA-seq) has provided an unprecedented opportunity to analyze transcriptomes at isoform resolution. However, the technology is not free from biases, and transcript models inferred from these data require quality control and curation. In this study, we introduce SQANTI3, a tool specifically designed to perform quality analysis on transcriptomes constructed using lrRNA-seq data. SQANTI3 provides an extensive naming framework to describe transcript model diversity in comparison to the reference transcriptome. Additionally, the tool incorporates a wide range of metrics to characterize structural properties of transcript models, such as transcription start and end sites and splice junctions. These metrics can be used to filter out potential artifacts. Moreover, SQANTI3 includes a Rescue module that prevents the loss of known genes and transcripts that show evidence of expression but display low-quality features. Lastly, SQANTI3 incorporates IsoAnnotLite, which enables functional annotation at the isoform level and facilitates functional iso-transcriptomics analyses. We demonstrate the versatility of SQANTI3 in analyzing different data types, isoform reconstruction pipelines, and sequencing platforms, and show how it provides novel insights into isoform biology. The SQANTI3 software is available at https://github.com/ConesaLab/SQANTI3.