Bayesian shrinkage estimation of the relative abundance of mRNA transcripts using SAGE.

Authors

Morris Jeffrey S, Baggerly Keith A, Coombes Kevin R

Affiliation

Department of Biostatistics, University of Texas, M. D. Anderson Cancer Center, 1515 Holcombe Blvd., Box 447, Houston, Texas 77030-4009, USA.

Publication

Biometrics. 2003 Sep;59(3):476-86. doi: 10.1111/1541-0420.00057.

DOI:10.1111/1541-0420.00057
PMID:14601748
Abstract

Serial analysis of gene expression (SAGE) is a technology for quantifying gene expression in biological tissue that yields count data that can be modeled by a multinomial distribution with two characteristics: skewness in the relative frequencies and small sample size relative to the dimension. As a result of these characteristics, a given SAGE sample may fail to capture a large number of expressed mRNA species present in the tissue. Empirical estimators of mRNA species' relative abundance effectively ignore these missing species, and as a result tend to overestimate the abundance of the scarce observed species comprising a vast majority of the total. We have developed a new Bayesian estimation procedure that quantifies our prior information about these characteristics, yielding a nonlinear shrinkage estimator with efficiency advantages over the MLE. Our prior is a mixture of Dirichlets, whereby species are stochastically partitioned into abundant and scarce classes, each with its own multivariate prior. Simulation studies reveal our estimator has lower integrated mean squared error (IMSE) than the MLE for the SAGE scenarios simulated, and yields relative abundance profiles closer in Euclidean distance to the truth for all samples simulated. We apply our method to a SAGE library of normal colon tissue, and discuss its implications for assessing differential expression.
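
The sampling problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' mixture-of-Dirichlets estimator; it only draws a multinomial sample from a hypothetical skewed abundance profile, computes the empirical estimate (MLE), and compares it with the textbook posterior mean under a single symmetric Dirichlet prior, (x_i + alpha) / (n + K * alpha). The profile (10 abundant and 5000 scarce species), the library size of 2000 tags, the value alpha = 0.01, and all function names are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_sage_library(true_probs, n_tags, seed=0):
    """Draw n_tags tags from a multinomial over species with the given probabilities."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(true_probs)), weights=true_probs, k=n_tags)
    tally = Counter(draws)
    return [tally.get(i, 0) for i in range(len(true_probs))]


def mle(counts):
    """Empirical relative abundance: x_i / n."""
    n = sum(counts)
    return [c / n for c in counts]


def dirichlet_posterior_mean(counts, alpha):
    """Posterior mean under a symmetric Dirichlet(alpha) prior: (x_i + alpha) / (n + K * alpha)."""
    n = sum(counts)
    k = len(counts)
    return [(c + alpha) / (n + k * alpha) for c in counts]


def squared_error(estimate, truth):
    """Squared Euclidean distance between an estimated and a true abundance profile."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))


# Hypothetical skewed profile (an assumption, not data from the paper):
# 10 abundant species hold half of the mass, 5000 scarce species share the rest.
n_abundant, n_scarce = 10, 5000
true_probs = [0.5 / n_abundant] * n_abundant + [0.5 / n_scarce] * n_scarce

counts = simulate_sage_library(true_probs, n_tags=2000)
empirical = mle(counts)
shrunk = dirichlet_posterior_mean(counts, alpha=0.01)

unseen = sum(1 for c in counts if c == 0)
print("species never observed:", unseen, "of", len(counts))
print("squared error, MLE:      ", squared_error(empirical, true_probs))
print("squared error, Dirichlet:", squared_error(shrunk, true_probs))
```

With 2000 tags and 5010 species, at least 3010 species must go unobserved, and every scarce species that is observed receives an empirical estimate of at least 1/2000, five times its true abundance of 1/10000 in this setup. A single symmetric prior pulls all species toward the same value, abundant ones included, which is one way to see why a prior with separate abundant and scarce classes, as the abstract describes, might be preferable.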


Similar articles

1
Bayesian shrinkage estimation of the relative abundance of mRNA transcripts using SAGE.
Biometrics. 2003 Sep;59(3):476-86. doi: 10.1111/1541-0420.00057.
2
Shrinkage estimation of effect sizes as an alternative to hypothesis testing followed by estimation in high-dimensional biology: applications to differential gene expression.
Stat Appl Genet Mol Biol. 2010;9:Article23. doi: 10.2202/1544-6115.1504. Epub 2010 Jun 8.
3
Estimators of the local false discovery rate designed for small numbers of tests.
Stat Appl Genet Mol Biol. 2012 Oct 12;11(5):4. doi: 10.1515/1544-6115.1807.
4
Modeling SAGE tag formation and its effects on data interpretation within a Bayesian framework.
BMC Bioinformatics. 2007 Oct 18;8:403. doi: 10.1186/1471-2105-8-403.
5
An Empirical Bayes Approach to Shrinkage Estimation on the Manifold of Symmetric Positive-Definite Matrices.
J Am Stat Assoc. 2024;119(545):259-272. doi: 10.1080/01621459.2022.2110877. Epub 2022 Sep 27.
6
Small-sample estimation of negative binomial dispersion, with applications to SAGE data.
Biostatistics. 2008 Apr;9(2):321-32. doi: 10.1093/biostatistics/kxm030. Epub 2007 Aug 29.
7
Statistical modeling of sequencing errors in SAGE libraries.
Bioinformatics. 2004 Aug 4;20 Suppl 1:i31-9. doi: 10.1093/bioinformatics/bth924.
8
[Transcriptomes for serial analysis of gene expression].
J Soc Biol. 2002;196(4):303-7.
9
Empirical Bayes estimation of posterior probabilities of enrichment: a comparative study of five estimators of the local false discovery rate.
BMC Bioinformatics. 2013 Mar 6;14:87. doi: 10.1186/1471-2105-14-87.
10
Modeling and analysis of multi-library, multi-group SAGE data with application to a study of mouse cerebellum.
Biometrics. 2007 Sep;63(3):777-86. doi: 10.1111/j.1541-0420.2006.00733.x.

Cited by

1
A Bayesian Semi-parametric Approach for the Differential Analysis of Sequence Counts Data.
J R Stat Soc Ser C Appl Stat. 2014 Apr;63(3):385-404. doi: 10.1111/rssc.12041.
2
Bayesian Nonparametric Inference - Why and How.
Bayesian Anal. 2013;8(2). doi: 10.1214/13-BA811.
3
Estimating species richness by a Poisson-compound gamma model.
Biometrika. 2010 Sep;97(3):727-740. doi: 10.1093/biomet/asq026. Epub 2010 Jun 22.
4
Bayesian hierarchical modeling and selection of differentially expressed genes for the EST data.
Biometrics. 2011 Mar;67(1):142-50. doi: 10.1111/j.1541-0420.2010.01447.x.
5
Bias correction and Bayesian analysis of aggregate counts in SAGE libraries.
BMC Bioinformatics. 2010 Feb 3;11:72. doi: 10.1186/1471-2105-11-72.
6
Modeling transcriptome based on transcript-sampling data.
PLoS One. 2008 Feb 20;3(2):e1659. doi: 10.1371/journal.pone.0001659.
7
Modeling SAGE tag formation and its effects on data interpretation within a Bayesian framework.
BMC Bioinformatics. 2007 Oct 18;8:403. doi: 10.1186/1471-2105-8-403.
8
Modeling Sage data with a truncated gamma-Poisson model.
BMC Bioinformatics. 2006 Mar 20;7:157. doi: 10.1186/1471-2105-7-157.
9
Bayesian model accounting for within-class biological variability in Serial Analysis of Gene Expression (SAGE).
BMC Bioinformatics. 2004 Aug 31;5:119. doi: 10.1186/1471-2105-5-119.