SparseIso: a novel Bayesian approach to identify alternatively spliced isoforms from RNA-seq data.

Affiliations

Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.

Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA.

Publication information

Bioinformatics. 2018 Jan 1;34(1):56-63. doi: 10.1093/bioinformatics/btx557.

Abstract

MOTIVATION

Recent advances in high-throughput RNA sequencing (RNA-seq) technologies have made it possible to reconstruct the full transcriptome of various types of cells. It is important to accurately assemble transcripts or identify isoforms for an improved understanding of molecular mechanisms in biological systems.

RESULTS

We have developed a novel Bayesian method, SparseIso, to reliably identify spliced isoforms from RNA-seq data. A spike-and-slab prior is incorporated into the Bayesian model to enforce sparsity in isoform identification, which alleviates overfitting. A Gibbs sampling procedure is further developed to identify and quantify transcripts simultaneously. Through sampling, SparseIso estimates the joint distribution of all candidate transcripts, which significantly improves the detection of lowly expressed transcripts and of genes with multiple expressed isoforms. Both a simulation study and real data analysis demonstrate that SparseIso significantly outperforms existing methods in transcript assembly and isoform identification.
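To make the two ingredients named above more concrete, the following is a minimal sketch of Gibbs sampling under a spike-and-slab prior on a toy linear model. It is not the SparseIso implementation: the abstract does not specify the likelihood, so the Gaussian noise model, the known noise variance `sigma2`, the Gaussian slab with variance `tau2`, the prior inclusion probability `pi`, the function name `gibbs_spike_slab`, and the toy segment-by-isoform matrix are all assumptions chosen for illustration.

```python
import numpy as np


def gibbs_spike_slab(X, y, n_iter=2000, burn_in=500,
                     pi=0.2, tau2=100.0, sigma2=1.0, seed=0):
    """Toy Gibbs sampler for y = X @ beta + noise with a spike-and-slab prior.

    Assumed prior (illustrative only): beta[j] is exactly 0 with probability
    1 - pi (the spike) and drawn from N(0, tau2) with probability pi (the
    slab). The noise variance sigma2 is treated as known.
    Returns (posterior inclusion probabilities, posterior mean of beta).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n_iso = X.shape[1]
    beta = np.zeros(n_iso)
    incl_count = np.zeros(n_iso)
    beta_sum = np.zeros(n_iso)
    col_ss = (X ** 2).sum(axis=0)            # X_j' X_j for every column
    resid = y - X @ beta                     # current residual
    prior_log_odds = np.log(pi) - np.log1p(-pi)

    for it in range(n_iter):
        for j in range(n_iso):
            # Residual with candidate j's current contribution added back.
            resid = resid + X[:, j] * beta[j]
            # Conditional posterior of beta[j] given that it is in the slab.
            v = 1.0 / (col_ss[j] / sigma2 + 1.0 / tau2)
            m = v * (X[:, j] @ resid) / sigma2
            # Log posterior odds of slab vs. spike, with beta[j] integrated out.
            log_odds = prior_log_odds + 0.5 * np.log(v / tau2) + 0.5 * m * m / v
            p_on = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -50.0, 50.0)))
            # Sample the on/off indicator, then the abundance if switched on.
            beta[j] = rng.normal(m, np.sqrt(v)) if rng.random() < p_on else 0.0
            resid = resid - X[:, j] * beta[j]
        if it >= burn_in:
            incl_count += (beta != 0)
            beta_sum += beta

    kept = n_iter - burn_in
    return incl_count / kept, beta_sum / kept


# Hypothetical toy gene: 6 exon segments (rows) x 4 candidate isoforms (columns).
# X[s, k] = 1 if candidate isoform k covers segment s.
X = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
], dtype=float)
true_beta = np.array([10.0, 0.0, 0.0, 5.0])            # only isoforms 0 and 3 expressed
noise = np.array([0.3, -0.5, 0.2, -0.1, 0.4, -0.3])    # fixed toy noise
y = X @ true_beta + noise

incl, beta_mean = gibbs_spike_slab(X, y)
print("posterior inclusion probability:", np.round(incl, 2))
print("posterior mean abundance:       ", np.round(beta_mean, 2))
```

In this sketch each sweep draws an on/off indicator and an abundance for every candidate conditional on all the others, so the retained samples approximate a joint posterior over which candidates are present and how abundant they are. The spike at zero is what lets unsupported candidates drop out entirely instead of receiving small nonzero abundances.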

AVAILABILITY AND IMPLEMENTATION

The SparseIso package is available at http://github.com/henryxushi/SparseIso.

CONTACT

xuan@vt.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
