Fast and accurate approximate inference of transcript expression from RNA-seq data.

Authors

Hensman James, Papastamoulis Panagiotis, Glaus Peter, Honkela Antti, Rattray Magnus

Affiliations

Sheffield Institute for Translational Neuroscience (SITraN), Sheffield, UK.

Faculty of Life Sciences.

Publication information

Bioinformatics. 2015 Dec 15;31(24):3881-9. doi: 10.1093/bioinformatics/btv483. Epub 2015 Aug 26.

DOI:10.1093/bioinformatics/btv483
PMID:26315907
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4673974/
Abstract

MOTIVATION

Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations.

RESULTS

We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time.
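
For orientation, the standard Variational Bayes Expectation Maximization baseline mentioned above can be sketched on a toy read-assignment problem. This is not the paper's new inference scheme and not BitSeq code. It is a minimal illustration that assumes a plain mixture model with a symmetric Dirichlet prior on transcript proportions and a precomputed read-given-transcript likelihood matrix. The names `vbem_transcript_proportions`, `lik`, and `alpha0` are hypothetical, and the model omits details a real tool would need, such as transcript length effects.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift x upward, where the series is accurate
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def vbem_transcript_proportions(lik, alpha0=1.0, n_iter=200, tol=1e-10):
    """Standard VBEM for a mixture with fixed read-given-transcript likelihoods.

    lik[n][m] is the (hypothetical, precomputed) likelihood of read n under
    transcript m, 0.0 if the read does not align there. Every read needs at
    least one non-zero entry. Returns the Dirichlet parameters and the
    posterior mean of the mixing proportions.
    """
    n_reads, n_tx = len(lik), len(lik[0])
    alpha = [alpha0 + n_reads / n_tx] * n_tx  # uniform initial assignment
    for _ in range(n_iter):
        # E-step: expected log-proportions under q(theta) = Dirichlet(alpha).
        # The common term digamma(sum(alpha)) cancels on normalisation.
        weight = [math.exp(digamma(a)) for a in alpha]
        counts = [0.0] * n_tx
        for row in lik:
            unnorm = [l * w for l, w in zip(row, weight)]
            total = sum(unnorm)
            for m in range(n_tx):
                counts[m] += unnorm[m] / total  # responsibility of m for this read
        # M-step: update the Dirichlet parameters with expected read counts.
        new_alpha = [alpha0 + c for c in counts]
        change = max(abs(a - b) for a, b in zip(alpha, new_alpha))
        alpha = new_alpha
        if change < tol:
            break
    total_alpha = sum(alpha)
    return alpha, [a / total_alpha for a in alpha]


# Toy data: 3 reads unique to transcript 0, 1 unique to transcript 1,
# and 2 reads that align equally well to both.
lik = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] + [[0.5, 0.5]] * 2
alpha, theta = vbem_transcript_proportions(lik)
print(theta)
```

In this toy run the two ambiguous reads are shared between the transcripts in proportion to the current expected abundances, so transcript 0, which has more unique reads, ends up with the larger estimated proportion. Each iteration touches every read-transcript pair, which gives a sense of why convergence speed matters for large datasets and complex annotations.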

AVAILABILITY AND IMPLEMENTATION

The methods were implemented in R and C++, and are available as part of the BitSeq project at github.com/BitSeq. The method is also available through the BitSeq Bioconductor package. The source code to reproduce all simulation results can be accessed via github.com/BitSeq/BitSeqVB_benchmarking.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/b7044dbdc256/btv483f1p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/aac0cdc0008a/btv483f2p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/8280cc6f48c5/btv483f3p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/6d5cfe88e682/btv483f4p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/59773508a0b9/btv483f5p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/b35b06e4965b/btv483f6p.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e81/4673974/df924487ef8f/btv483f7p.jpg

Similar articles

1
Fast and accurate approximate inference of transcript expression from RNA-seq data.
Bioinformatics. 2015 Dec 15;31(24):3881-9. doi: 10.1093/bioinformatics/btv483. Epub 2015 Aug 26.
2
TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference.
Bioinformatics. 2013 Sep 15;29(18):2292-9. doi: 10.1093/bioinformatics/btt381. Epub 2013 Jul 2.
3
Improved variational Bayes inference for transcript expression estimation.
Stat Appl Genet Mol Biol. 2014 Apr 1;13(2):203-16. doi: 10.1515/sagmb-2013-0054.
4
Identifying differentially expressed transcripts from RNA-seq data with biological variation.
Bioinformatics. 2012 Jul 1;28(13):1721-8. doi: 10.1093/bioinformatics/bts260. Epub 2012 May 3.
5
A comparison of computational algorithms for the Bayesian analysis of clinical trials.
Clin Trials. 2024 Dec;21(6):689-700. doi: 10.1177/17407745241247334. Epub 2024 May 16.
6
TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads.
BMC Genomics. 2014;15 Suppl 10(Suppl 10):S5. doi: 10.1186/1471-2164-15-S10-S5. Epub 2014 Dec 12.
7
Fast Bayesian whole-brain fMRI analysis with spatial 3D priors.
Neuroimage. 2017 Feb 1;146:211-225. doi: 10.1016/j.neuroimage.2016.11.040. Epub 2016 Nov 19.
8
Parallel Metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference.
Bioinformatics. 2004 Feb 12;20(3):407-15. doi: 10.1093/bioinformatics/btg427. Epub 2004 Jan 22.
9
Modelling capture efficiency of single-cell RNA-sequencing data improves inference of transcriptome-wide burst kinetics.
Bioinformatics. 2023 Jul 1;39(7). doi: 10.1093/bioinformatics/btad395.
10
Variational Bayes inference for hidden Markov diagnostic classification models.
Br J Math Stat Psychol. 2024 Feb;77(1):55-79. doi: 10.1111/bmsp.12308. Epub 2023 May 30.

Cited by

1
Oarfish: enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification.
Bioinformatics. 2025 Jul 1;41(Supplement_1):i304-i313. doi: 10.1093/bioinformatics/btaf240.
2
Oarfish: Enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification.
bioRxiv. 2024 Mar 1:2024.02.28.582591. doi: 10.1101/2024.02.28.582591.
3
Perplexity: evaluating transcript abundance estimation in the absence of ground truth.
Algorithms Mol Biol. 2022 Mar 25;17(1):6. doi: 10.1186/s13015-022-00214-y.
4
Deriving Ranges of Optimal Estimated Transcript Expression due to Nonidentifiability.
J Comput Biol. 2022 Feb;29(2):121-139. doi: 10.1089/cmb.2021.0444. Epub 2022 Jan 17.
5
Combining Multiple RNA-Seq Data Analysis Algorithms Using Machine Learning Improves Differential Isoform Expression Analysis.
Methods Protoc. 2021 Sep 27;4(4):68. doi: 10.3390/mps4040068.
6
Polee: RNA-Seq analysis using approximate likelihood.
NAR Genom Bioinform. 2021 May 25;3(2):lqab046. doi: 10.1093/nargab/lqab046. eCollection 2021 Jun.
7
Exact transcript quantification over splice graphs.
Algorithms Mol Biol. 2021 May 10;16(1):5. doi: 10.1186/s13015-021-00184-7.
8
Alignment and mapping methodology influence transcript abundance estimation.
Genome Biol. 2020 Sep 7;21(1):239. doi: 10.1186/s13059-020-02151-8.
9
A Bayesian framework for inter-cellular information sharing improves dscRNA-seq quantification.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i292-i299. doi: 10.1093/bioinformatics/btaa450.
10
Detecting, Categorizing, and Correcting Coverage Anomalies of RNA-Seq Quantification.
Cell Syst. 2019 Dec 18;9(6):589-599.e7. doi: 10.1016/j.cels.2019.10.005. Epub 2019 Nov 27.

References

1
Fast Nonparametric Clustering of Structured Time-Series.
IEEE Trans Pattern Anal Mach Intell. 2015 Feb;37(2):383-93. doi: 10.1109/TPAMI.2014.2318711.
2
TIGAR2: sensitive and accurate estimation of transcript isoform expression with longer RNA-Seq reads.
BMC Genomics. 2014;15 Suppl 10(Suppl 10):S5. doi: 10.1186/1471-2164-15-S10-S5. Epub 2014 Dec 12.
3
A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium.
Nat Biotechnol. 2014 Sep;32(9):903-14. doi: 10.1038/nbt.2957. Epub 2014 Aug 24.
4
Quantifying alternative splicing from paired-end RNA-sequencing data.
Ann Appl Stat. 2014 Mar;8(1):309-330. doi: 10.1214/13-aoas687.
5
Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms.
Nat Biotechnol. 2014 May;32(5):462-4. doi: 10.1038/nbt.2862. Epub 2014 Apr 20.
6
Improved variational Bayes inference for transcript expression estimation.
Stat Appl Genet Mol Biol. 2014 Apr 1;13(2):203-16. doi: 10.1515/sagmb-2013-0054.
7
Design of RNA splicing analysis null models for post hoc filtering of Drosophila head RNA-Seq data with the splicing analysis kit (Spanki).
BMC Bioinformatics. 2013 Nov 9;14:320. doi: 10.1186/1471-2105-14-320.
8
TIGAR: transcript isoform abundance estimation method with gapped alignment of RNA-Seq data by variational Bayesian inference.
Bioinformatics. 2013 Sep 15;29(18):2292-9. doi: 10.1093/bioinformatics/btt381. Epub 2013 Jul 2.
9
Differential analysis of gene regulation at transcript resolution with RNA-seq.
Nat Biotechnol. 2013 Jan;31(1):46-53. doi: 10.1038/nbt.2450. Epub 2012 Dec 9.
10
Ensembl 2013.
Nucleic Acids Res. 2013 Jan;41(Database issue):D48-55. doi: 10.1093/nar/gks1236. Epub 2012 Nov 30.