Santucci Kristina, Cheng Yuning, Xu Si-Mei, Janitz Michael
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW 2052, Australia.
Brief Funct Genomics. 2024 Dec 6;23(6):683-694. doi: 10.1093/bfgp/elae031.
Long-read sequencing technologies can capture entire RNA transcripts in a single sequencing read, reducing the ambiguity in constructing and quantifying transcript models compared with earlier and more common methods such as short-read sequencing. Recent improvements in the accuracy of long-read sequencing technologies have expanded the scope for novel splice isoform detection and have enabled far more accurate reconstruction of complex splicing patterns and transcriptomes. Additionally, the incorporation of, and advances in, machine learning and deep learning algorithms in bioinformatic software have significantly improved the reliability of long-read sequencing transcriptomic studies. However, there is no consensus on which bioinformatic tools and pipelines produce the most precise and consistent results. This review therefore discusses and compares the performance of available methods for novel isoform discovery with long-read sequencing technologies, presenting 25 tools. It also aims to demonstrate the need for standard analytical pipelines, tools, and transcript model conventions for novel isoform discovery and transcriptomic studies.