IntAPT: integrated assembly of phenotype-specific transcripts from multiple RNA-seq profiles.

Author affiliations

Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.

Publication information

Bioinformatics. 2021 May 5;37(5):650-658. doi: 10.1093/bioinformatics/btaa852.

Abstract

MOTIVATION

High-throughput RNA sequencing has revolutionized the scope and depth of transcriptome analysis. Accurate reconstruction of a phenotype-specific transcriptome is challenging due to the noise and variability of RNA-seq data. This requires computational identification of transcripts from multiple samples of the same phenotype, given the underlying consensus transcript structure.

RESULTS

We present a Bayesian method, integrated assembly of phenotype-specific transcripts (IntAPT), that identifies phenotype-specific isoforms from multiple RNA-seq profiles. IntAPT features a novel two-layer Bayesian model that captures the presence of isoforms at the group layer and quantifies the abundance of isoforms at the sample layer. A spike-and-slab prior is used to model isoform expression and to enforce sparsity among the expressed isoforms. Dependencies between the existence of isoforms and their expression are modeled explicitly to facilitate parameter estimation. Model parameters are estimated iteratively using Gibbs sampling to infer the joint posterior distribution, from which the presence and abundance of isoforms can be reliably determined. Studies using both simulations and real datasets show that IntAPT consistently outperforms existing methods for the integrated assembly of phenotype-specific transcripts. Experimental results demonstrate that, despite sequencing errors, IntAPT performs robustly across multiple samples, resulting in notably improved identification of expressed isoforms of low abundance.
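To make the idea of a group-level presence indicator combined with sample-level abundances more concrete, the following is a minimal toy sketch of a spike-and-slab Gibbs sampler. It is not the IntAPT implementation and does not reproduce the paper's model: it assumes a linear-Gaussian likelihood with known noise variance, a zero-mean Gaussian slab (so abundances are not constrained to be non-negative), and a fixed prior inclusion probability. The function name `gibbs_spike_slab`, its parameters, and the example data are all hypothetical.

```python
import math
import random


def gibbs_spike_slab(X, Y, n_iter=400, burn_in=100, pi=0.5,
                     sigma2=1.0, tau2=25.0, seed=0):
    """Toy two-layer spike-and-slab Gibbs sampler (illustrative only).

    X[b][g] is 1 if candidate isoform g covers bin b, else 0.
    Y[s][b] is the observed signal of sample s in bin b.
    Returns (inclusion probability per isoform, mean abundance[s][g]).
    """
    rng = random.Random(seed)
    n_bins, n_iso, n_samp = len(X), len(X[0]), len(Y)
    z = [0] * n_iso                                   # group layer: presence
    beta = [[0.0] * n_iso for _ in range(n_samp)]     # sample layer: abundance
    z_sum = [0.0] * n_iso
    beta_sum = [[0.0] * n_iso for _ in range(n_samp)]
    for it in range(n_iter):
        for g in range(n_iso):
            col = [X[b][g] for b in range(n_bins)]
            xtx = sum(c * c for c in col)
            # Conditional posterior variance of beta[s][g] under the slab.
            v = 1.0 / (xtx / sigma2 + 1.0 / tau2)
            log_odds = math.log(pi / (1.0 - pi))
            means = []
            for s in range(n_samp):
                # Project the residual (isoform g removed) onto column g.
                xtr = 0.0
                for b in range(n_bins):
                    others = sum(X[b][k] * beta[s][k]
                                 for k in range(n_iso) if k != g)
                    xtr += col[b] * (Y[s][b] - others)
                m = v * xtr / sigma2
                means.append(m)
                # Log Bayes factor (slab vs. spike) with beta integrated out;
                # all samples share the same presence indicator z[g].
                log_odds += 0.5 * math.log(v / tau2) + 0.5 * m * m / v
            # Numerically stable logistic transform.
            if log_odds >= 0:
                p_on = 1.0 / (1.0 + math.exp(-log_odds))
            else:
                e = math.exp(log_odds)
                p_on = e / (1.0 + e)
            z[g] = 1 if rng.random() < p_on else 0
            for s in range(n_samp):
                beta[s][g] = rng.gauss(means[s], math.sqrt(v)) if z[g] else 0.0
        if it >= burn_in:
            for g in range(n_iso):
                z_sum[g] += z[g]
                for s in range(n_samp):
                    beta_sum[s][g] += beta[s][g]
    kept = n_iter - burn_in
    incl = [t / kept for t in z_sum]
    mean_beta = [[t / kept for t in row] for row in beta_sum]
    return incl, mean_beta


# Hypothetical data: 4 bins, 3 candidate isoforms, 3 samples.
# Isoform 0 covers bins 0-1, isoform 1 covers bins 2-3, isoform 2 covers bins 1-2.
X = [[1, 0, 0], [1, 0, 1], [0, 1, 1], [0, 1, 0]]
# Signals made up so that only isoforms 0 and 1 are expressed.
Y = [[10.3, 9.8, 6.1, 5.7],
     [11.6, 12.2, 3.9, 4.4],
     [8.1, 8.4, 7.2, 6.8]]
incl, mean_beta = gibbs_spike_slab(X, Y)
print("inclusion probabilities:", [round(p, 2) for p in incl])
print("mean abundances, sample 0:", [round(b, 2) for b in mean_beta[0]])
```

Because every sample contributes to the same presence indicator, evidence for an isoform accumulates across samples, while each sample keeps its own abundance estimate. This is only the general mechanism described in the abstract; the prior forms, likelihood, and sampling steps used by IntAPT itself should be taken from the paper and its package.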

AVAILABILITY AND IMPLEMENTATION

The IntAPT package is available at http://github.com/henryxushi/IntAPT.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
