BADGE: a novel Bayesian model for accurate abundance quantification and differential analysis of RNA-Seq data.

Publication information

BMC Bioinformatics. 2014;15 Suppl 9(Suppl 9):S6. doi: 10.1186/1471-2105-15-S9-S6. Epub 2014 Sep 10.

DOI:10.1186/1471-2105-15-S9-S6
PMID:25252852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4168709/
Abstract

BACKGROUND

Recent advances in RNA sequencing (RNA-Seq) technology have offered unprecedented scope and resolution for transcriptome analysis. However, precise quantification of mRNA abundance and identification of differentially expressed genes are complicated by biological and technical variation in RNA-Seq data.

RESULTS

We systematically study the variation in count data and separate its sources into between-sample variation and within-sample variation. A novel Bayesian framework is developed for joint estimation of gene-level mRNA abundance and differential state; it models the intrinsic variability in RNA-Seq data to improve the estimation. Specifically, a Poisson-Lognormal model is incorporated into the Bayesian framework to model within-sample variation, and a Gamma-Gamma model is then used to model between-sample variation, which accounts for over-dispersion of read counts among multiple samples. Simulation studies, in which sequencing counts are synthesized from parameters learned from real datasets, demonstrate the advantage of the proposed method in both quantification of mRNA abundance and identification of differentially expressed genes. Moreover, a performance comparison on data from the Sequencing Quality Control (SEQC) Project with ERCC spike-in controls shows that the proposed method outperforms existing RNA-Seq methods in differential analysis. Application to a breast cancer dataset further illustrates that the proposed Bayesian model can 'blindly' estimate sources of variation caused by sequencing biases.
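To make the over-dispersion idea concrete, the following is a minimal sketch of a generic Poisson-Lognormal count model, written with the Python standard library only. It is not the BADGE implementation and does not use the paper's parametrization: the function names, the parameters `mu` and `sigma` (mean and standard deviation of the log rate), and the simulation settings are all assumptions chosen for illustration. The sketch only shows why mixing a Poisson rate over a lognormal distribution gives counts whose variance exceeds their mean.

```python
import math
import random
import statistics


def poisson_lognormal_moments(mu, sigma):
    """Analytic mean and variance of y ~ Poisson(lam) with log(lam) ~ Normal(mu, sigma^2).

    Generic textbook result (law of total variance), not BADGE's own parametrization.
    """
    mean = math.exp(mu + 0.5 * sigma ** 2)
    # Var[y] = E[lam] + Var[lam]; the second term is the extra-Poisson part.
    variance = mean + mean ** 2 * (math.exp(sigma ** 2) - 1.0)
    return mean, variance


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the modest rates used here; not meant for very large lam.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(mu, sigma, n, seed=0):
    """Simulate n counts: draw a lognormal rate, then a Poisson count at that rate."""
    rng = random.Random(seed)
    return [sample_poisson(rng.lognormvariate(mu, sigma), rng) for _ in range(n)]


def dispersion_summary(counts):
    """Return the sample mean and sample variance of a list of counts."""
    return statistics.mean(counts), statistics.variance(counts)


if __name__ == "__main__":
    # Hypothetical settings, chosen only to make the effect visible.
    theory_mean, theory_var = poisson_lognormal_moments(mu=2.0, sigma=0.5)
    sample_mean, sample_var = dispersion_summary(simulate_counts(2.0, 0.5, n=20000))
    print(f"theory: mean={theory_mean:.2f} variance={theory_var:.2f}")
    print(f"sample: mean={sample_mean:.2f} variance={sample_var:.2f}")
```

With `sigma = 0` the model reduces to a plain Poisson (variance equal to the mean); any `sigma > 0` inflates the variance by a term that grows with the square of the mean, which is the kind of extra variability a hierarchical model has to absorb.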

CONCLUSIONS

We have developed a novel Bayesian hierarchical approach to investigate within-sample and between-sample variation in RNA-Seq data. Simulations and real-data applications have validated the performance of the proposed method. The software package is available at http://www.cbil.ece.vt.edu/software.htm.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/71c9f6e6d54b/1471-2105-15-S9-S6-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/7db47cb65b0f/1471-2105-15-S9-S6-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/79a441d71af2/1471-2105-15-S9-S6-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/fe9d9c4fffc6/1471-2105-15-S9-S6-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/3cbbd181be15/1471-2105-15-S9-S6-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ab7/4168709/04d8eda41902/1471-2105-15-S9-S6-6.jpg
