Bayesian Correlation Analysis for Sequence Count Data.

Affiliations

Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada.

Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada.

Publication information

PLoS One. 2016 Oct 4;11(10):e0163595. doi: 10.1371/journal.pone.0163595. eCollection 2016.

DOI:10.1371/journal.pone.0163595
PMID:27701449
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5049778/
Abstract

Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low, especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.
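
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the details are assumptions made for illustration: each entity's signal in each experiment is given an independent Beta posterior (a Beta(alpha, beta) prior updated with the entity's read count out of the experiment's total reads), and the correlation denominator adds the average posterior variance to the variance of the posterior means across experiments. The function name `bayesian_correlation` and the parameters `alpha` and `beta` are hypothetical, and the uniform default (alpha = beta = 1) is the kind of naive prior the abstract warns can bias estimates.

```python
import numpy as np


def bayesian_correlation(counts, alpha=1.0, beta=1.0):
    """Uncertainty-aware correlation between rows of a count matrix.

    counts: array of shape (n_entities, n_experiments), non-negative counts.
    alpha, beta: Beta prior parameters (assumed, hypothetical defaults).
    """
    counts = np.asarray(counts, dtype=float)
    # Sequencing depth of each experiment, taken here as the column total.
    depth = counts.sum(axis=0, keepdims=True)
    # Beta posterior parameters for each entity in each experiment.
    a = alpha + counts
    b = beta + depth - counts
    post_mean = a / (a + b)
    post_var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    # Covariance of posterior means across experiments.
    centered = post_mean - post_mean.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / counts.shape[1]
    # Total variance = variance of the means + average posterior variance.
    total_var = np.diag(cov) + post_var.mean(axis=1)
    return cov / np.sqrt(np.outer(total_var, total_var))


# Same relative pattern at low and at high read counts.
low = np.array([[1, 2, 3, 4], [2, 4, 6, 8], [50, 40, 30, 20]])
high = low * 1000
r_low = bayesian_correlation(low)
r_high = bayesian_correlation(high)
# The entity 0 / entity 1 correlation is lower for the low-count data.
print(r_low[0, 1], r_high[0, 1])
```

In this sketch the first two entities follow the same pattern in both matrices, but the low-count version yields a smaller correlation because the posterior variance inflates the denominator. The diagonal of the returned matrix is below 1 for the same reason. The matrix is a rescaled Gram matrix of the centered posterior means, so it is positive semidefinite by construction, which is consistent with the kernel property the abstract mentions.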

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/3950be4f3328/pone.0163595.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/276da2d83214/pone.0163595.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/faa248417108/pone.0163595.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/df924a8ad86c/pone.0163595.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/b0b8c589b540/pone.0163595.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa3/5049778/d2256436b83b/pone.0163595.g006.jpg

Similar articles

1
Bayesian Correlation Analysis for Sequence Count Data.
PLoS One. 2016 Oct 4;11(10):e0163595. doi: 10.1371/journal.pone.0163595. eCollection 2016.
2
Integrative analysis of histone ChIP-seq and transcription data using Bayesian mixture models.
Bioinformatics. 2014 Apr 15;30(8):1154-1162. doi: 10.1093/bioinformatics/btu003. Epub 2014 Jan 7.
3
NPEBseq: nonparametric empirical bayesian-based procedure for differential expression analysis of RNA-seq data.
BMC Bioinformatics. 2013 Aug 27;14:262. doi: 10.1186/1471-2105-14-262.
4
Detecting Multivariate Gene Interactions in RNA-Seq Data Using Optimal Bayesian Classification.
IEEE/ACM Trans Comput Biol Bioinform. 2018 Mar-Apr;15(2):484-493. doi: 10.1109/TCBB.2015.2485223. Epub 2015 Oct 1.
5
Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data.
NAR Genom Bioinform. 2020 Jan 24;2(1):lqaa002. doi: 10.1093/nargab/lqaa002. eCollection 2020 Mar.
6
A multitask clustering approach for single-cell RNA-seq analysis in Recessive Dystrophic Epidermolysis Bullosa.
PLoS Comput Biol. 2018 Apr 9;14(4):e1006053. doi: 10.1371/journal.pcbi.1006053. eCollection 2018 Apr.
7
Uncovering robust patterns of microRNA co-expression across cancers using Bayesian Relevance Networks.
PLoS One. 2017 Aug 17;12(8):e0183103. doi: 10.1371/journal.pone.0183103. eCollection 2017.
8
Correlation between RNA-Seq and microarrays results using TCGA data.
Gene. 2017 Sep 10;628:200-204. doi: 10.1016/j.gene.2017.07.056. Epub 2017 Jul 20.
9
Differential correlation for sequencing data.
BMC Res Notes. 2017 Jan 19;10(1):54. doi: 10.1186/s13104-016-2331-9.
10
BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.
Bioinformatics. 2015 Jan 15;31(2):265-7. doi: 10.1093/bioinformatics/btu614. Epub 2014 Sep 29.

Cited by

1
Meta-analysis defines principles for the design and analysis of co-fractionation mass spectrometry experiments.
Nat Methods. 2021 Jul;18(7):806-815. doi: 10.1038/s41592-021-01194-4. Epub 2021 Jul 1.
2
Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data.
NAR Genom Bioinform. 2020 Jan 24;2(1):lqaa002. doi: 10.1093/nargab/lqaa002. eCollection 2020 Mar.
3
BraInMap Elucidates the Macromolecular Connectivity Landscape of Mammalian Brain.
Cell Syst. 2020 Apr 22;10(4):333-350.e14. doi: 10.1016/j.cels.2020.03.003.
4
EPIC: software toolkit for elution profile-based inference of protein complexes.
Nat Methods. 2019 Aug;16(8):737-742. doi: 10.1038/s41592-019-0461-4. Epub 2019 Jul 15.
5
Uncovering robust patterns of microRNA co-expression across cancers using Bayesian Relevance Networks.
PLoS One. 2017 Aug 17;12(8):e0183103. doi: 10.1371/journal.pone.0183103. eCollection 2017.