Empirical Bayesian analysis of paired high-throughput sequencing data with a beta-binomial distribution.

Affiliation

Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge, CB2 3EA, UK.

Publication information

BMC Bioinformatics. 2013 Apr 23;14:135. doi: 10.1186/1471-2105-14-135.

PMID:23617841

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3658937/

Abstract

BACKGROUND

Pairing of samples arises naturally in many genomic experiments; for example, gene expression in tumour and normal tissue from the same patients. Methods for analysing high-throughput sequencing data from such experiments are required to identify differential expression, both within paired samples and between pairs under different experimental conditions.

RESULTS

We develop an empirical Bayesian method based on the beta-binomial distribution to model paired data from high-throughput sequencing experiments. We examine the performance of this method on simulated and real data in a variety of scenarios. Our methods are implemented as part of the R baySeq package (versions 1.11.6 and greater) available from Bioconductor (http://www.bioconductor.org).
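
As a rough illustration of the kind of model described here, the sketch below evaluates a beta-binomial log-likelihood for paired counts (u, v), treating u as the number of "successes" out of n = u + v reads. It is not the baySeq implementation or its API: the function names are hypothetical, and the mean/dispersion parameterisation (proportion p, dispersion phi, with a = p(1 - phi)/phi and b = (1 - p)(1 - phi)/phi) is an assumption chosen for readability. Library-size scaling and the empirical Bayesian estimation of priors are omitted.

```python
from math import lgamma, exp

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def betabinom_logpmf(k, n, p, phi):
    """Log-probability of k successes in n trials under a beta-binomial.

    Assumed parameterisation (illustrative only):
    p   = mean proportion, 0 < p < 1
    phi = dispersion, 0 < phi < 1 (smaller means closer to binomial)
    """
    a = p * (1 - phi) / phi
    b = (1 - p) * (1 - phi) / phi
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)

def paired_loglik(pairs, p, phi):
    """Sum of log-probabilities over paired counts (u, v), with n = u + v."""
    return sum(betabinom_logpmf(u, u + v, p, phi) for u, v in pairs)

# Hypothetical counts for one gene in three tumour/normal pairs.
pairs = [(80, 20), (75, 25), (90, 10)]
ll_equal = paired_loglik(pairs, p=0.5, phi=0.05)   # no difference within pairs
ll_skewed = paired_loglik(pairs, p=0.8, phi=0.05)  # consistent 4:1 skew
```

For these made-up counts, the log-likelihood under p = 0.8 exceeds that under p = 0.5, which is the kind of evidence a within-pair differential expression test would weigh.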

CONCLUSIONS

We compare our approach to alternatives based on generalised linear modelling approaches and show that our method offers significant gains in performance on simulated data. In testing on real data from oral squamous cell carcinoma patients, we discover greater enrichment of previously identified head and neck squamous cell carcinoma associated gene sets than has previously been achieved through a generalised linear modelling approach, suggesting that similar gains in performance may be found in real data. Our methods thus show real and substantial improvements in analyses of high-throughput sequencing data from paired samples.
