Enhancing transcriptome expression quantification through accurate assignment of long RNA sequencing reads with TranSigner.

Author information

Ji Hyun Joo, Pertea Mihaela

Affiliations

Center for Computational Biology, Johns Hopkins University; Baltimore, MD.

Department of Computer Science, Johns Hopkins University; Baltimore, MD.

Publication information

bioRxiv. 2024 Aug 17:2024.04.13.589356. doi: 10.1101/2024.04.13.589356.

Abstract

Recently developed long-read RNA sequencing technologies promise to provide a more accurate and comprehensive view of transcriptomes than short-read sequencers, primarily because they can sequence transcripts at full length. However, realizing this potential requires computational tools tailored to long reads, which exhibit a higher error rate than short reads. Existing methods for assembling and quantifying long-read data often disagree on which transcripts are expressed and on their abundance levels, which undermines researchers' confidence in the transcriptomes produced from these data. One approach to addressing the uncertainties in transcriptome assembly and quantification is to assign the long reads to transcripts, enabling a more detailed characterization of transcript support at the read level. Here, we introduce TranSigner, a versatile tool that assigns long reads to any input transcriptome. TranSigner consists of three consecutive modules that perform: read alignment to the given transcripts, computation of read-to-transcript compatibility based on alignment scores and positions, and execution of an expectation-maximization algorithm to probabilistically assign reads to transcripts and estimate transcript abundances. Using simulated data and experimental datasets from three well-studied organisms, we demonstrate that TranSigner achieves accurate read assignments and higher accuracy in transcript abundance estimation than existing tools.
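
The third module described above is an expectation-maximization (EM) step. As a rough illustration of how a generic EM can turn read-to-transcript compatibility scores into probabilistic assignments and relative abundances, here is a minimal sketch. It is not TranSigner's code or interface: the function name `em_assign`, the dict-based input layout, and the toy scores are all assumptions made for this example, and the compatibility scores are taken as already computed.

```python
def em_assign(compat, n_transcripts, n_iter=100, tol=1e-9):
    """Generic EM for read-to-transcript assignment (illustrative sketch).

    compat: one dict per read, mapping transcript index -> positive
            compatibility score. Every read is assumed to be compatible
            with at least one transcript.
    Returns (abundance, posteriors): relative abundances summing to 1 and,
    per read, the assignment probabilities from the last E-step.
    """
    n_reads = len(compat)
    # Start from uniform relative abundances.
    abundance = [1.0 / n_transcripts] * n_transcripts
    posteriors = []

    for _ in range(n_iter):
        # E-step: probability that each read came from each compatible
        # transcript, proportional to abundance * compatibility.
        posteriors = []
        for scores in compat:
            weights = {t: abundance[t] * s for t, s in scores.items()}
            total = sum(weights.values())
            posteriors.append({t: w / total for t, w in weights.items()})

        # M-step: abundance becomes the expected fraction of reads
        # assigned to each transcript.
        counts = [0.0] * n_transcripts
        for post in posteriors:
            for t, p in post.items():
                counts[t] += p
        new_abundance = [c / n_reads for c in counts]

        # Stop once the abundances no longer change noticeably.
        delta = max(abs(a - b) for a, b in zip(abundance, new_abundance))
        abundance = new_abundance
        if delta < tol:
            break

    return abundance, posteriors


# Toy input (hypothetical): two transcripts, four reads.
# Reads 0 and 1 match only transcript 0, read 3 matches only transcript 1,
# and read 2 is equally compatible with both.
toy_compat = [{0: 1.0}, {0: 1.0}, {0: 1.0, 1: 1.0}, {1: 1.0}]
abundance, posteriors = em_assign(toy_compat, n_transcripts=2)

# Hard assignment: pick the most probable transcript for each read.
hard = [max(post, key=post.get) for post in posteriors]
print(abundance)  # approaches [2/3, 1/3]
print(hard)       # [0, 0, 0, 1]
```

In this toy case the ambiguous read is pulled toward transcript 0 because the uniquely assigned reads already give that transcript a higher abundance. That is the basic way an EM resolves multi-mapping reads.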

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c5b8/11343119/068d6aec214b/nihpp-2024.04.13.589356v3-f0001.jpg
