Bayesian nonparametric variable selection as an exploratory tool for discovering differentially expressed genes.

Affiliation

Department of Statistics, University of California at Irvine, CA, USA.

Publication information

Stat Med. 2013 May 30;32(12):2114-26. doi: 10.1002/sim.5680. Epub 2012 Nov 22.

DOI:10.1002/sim.5680
PMID:23172736
Abstract

High-throughput scientific studies involving no clear a priori hypothesis are common. For example, a large-scale genomic study of a disease may examine thousands of genes without hypothesizing that any specific gene is responsible for the disease. In these studies, the objective is to explore a large number of possible factors (e.g., genes) in order to identify a small number that will be considered in follow-up studies that tend to be more thorough and on smaller scales. A simple, hierarchical, linear regression model with random coefficients is assumed for case-control data that correspond to each gene. The specific model used will be seen to be related to a standard Bayesian variable selection model. Relatively large regression coefficients correspond to potential differences in responses for cases versus controls and thus to genes that might 'matter'. For large-scale studies, and using a Dirichlet process mixture model for the regression coefficients, we are able to find clusters of regression effects of genes with increasing potential effect or 'relevance', in relation to the outcome of interest. One cluster will always correspond to genes whose coefficients are in a neighborhood that is relatively close to zero and will be deemed least relevant. Other clusters will correspond to increasing magnitudes of the random/latent regression coefficients. Using simulated data, we demonstrate that our approach could be quite effective in finding relevant genes compared with several alternative methods. We apply our model to two large-scale studies. The first study involves transcriptome analysis of infection by human cytomegalovirus. The second study's objective is to identify differentially expressed genes between two types of leukemia.
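The abstract does not state the priors or the sampler, so the following is only a minimal sketch of the general idea, not the authors' model. It assumes that each gene's case-versus-control effect is summarized by the slope on a 0/1 case indicator (which equals the difference in group means), that those slopes have a known noise variance, and that they are clustered with a Dirichlet process mixture of normals fitted by a standard collapsed Gibbs sampler (Chinese restaurant process form). All function names, parameter values, and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def simulate_effects(n_genes, n_per_group, relevant, shift, rng):
    """Simulate case-control data per gene and return the per-gene slope on a
    0/1 case indicator, i.e. mean(cases) - mean(controls)."""
    effects = []
    for g in range(n_genes):
        delta = shift if g in relevant else 0.0
        controls = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        cases = [rng.gauss(delta, 1.0) for _ in range(n_per_group)]
        effects.append(sum(cases) / n_per_group - sum(controls) / n_per_group)
    return effects


def dp_cluster(effects, alpha, noise_var, prior_var, n_sweeps, rng):
    """Collapsed Gibbs sampler for a DP mixture of normals (assumed model):
    effect ~ N(mu_k, noise_var), mu_k ~ N(0, prior_var), labels ~ CRP(alpha)."""
    z = [0] * len(effects)          # start with every gene in one cluster
    counts = {0: len(effects)}      # cluster label -> number of genes
    sums = {0: sum(effects)}        # cluster label -> sum of member effects
    next_label = 1
    for _ in range(n_sweeps):
        for i, b in enumerate(effects):
            old = z[i]              # remove gene i from its current cluster
            counts[old] -= 1
            sums[old] -= b
            if counts[old] == 0:
                del counts[old], sums[old]
            labels, weights = [], []
            for label, n in counts.items():
                # Posterior of the cluster mean, then predictive density of b.
                post_var = 1.0 / (1.0 / prior_var + n / noise_var)
                post_mean = post_var * sums[label] / noise_var
                labels.append(label)
                weights.append(n * normal_pdf(b, post_mean, post_var + noise_var))
            # Option to open a new cluster, weighted by the concentration alpha.
            labels.append(next_label)
            weights.append(alpha * normal_pdf(b, 0.0, prior_var + noise_var))
            new = rng.choices(labels, weights=weights)[0]
            if new == next_label:
                counts[new] = 0
                sums[new] = 0.0
                next_label += 1
            z[i] = new
            counts[new] += 1
            sums[new] += b
    return z


rng = random.Random(0)
relevant = set(range(10))           # hypothetical: first 10 of 100 genes matter
effects = simulate_effects(100, 20, relevant, 3.0, rng)
labels = dp_cluster(effects, alpha=1.0, noise_var=2.0 / 20,
                    prior_var=4.0, n_sweeps=50, rng=rng)

clusters = {}
for b, k in zip(effects, labels):
    clusters.setdefault(k, []).append(b)
# List clusters from smallest to largest absolute mean effect.
for k, members in sorted(clusters.items(),
                         key=lambda kv: abs(sum(kv[1]) / len(kv[1]))):
    print(f"cluster {k}: {len(members)} genes, "
          f"mean effect {sum(members) / len(members):+.2f}")
```

Under these assumptions, the cluster whose mean sits closest to zero plays the role of the "least relevant" group described in the abstract, and clusters with larger absolute means would be the candidates for follow-up. The sampler returns a single draw of the labels after the final sweep; a fuller analysis would average over many draws.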

Similar articles

1
Bayesian nonparametric variable selection as an exploratory tool for discovering differentially expressed genes.
Stat Med. 2013 May 30;32(12):2114-26. doi: 10.1002/sim.5680. Epub 2012 Nov 22.
2
Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.
Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):5-50.
3
Empirical Bayes ranking and selection methods via semiparametric hierarchical mixture models in microarray studies.
Stat Med. 2013 May 20;32(11):1904-16. doi: 10.1002/sim.5718. Epub 2012 Dec 28.
4
Hierarchical Bayesian formulations for selecting variables in regression models.
Stat Med. 2012 May 20;31(11-12):1221-37. doi: 10.1002/sim.4439. Epub 2012 Jan 25.
5
Gene selection: a Bayesian variable selection approach.
Bioinformatics. 2003 Jan;19(1):90-7. doi: 10.1093/bioinformatics/19.1.90.
6
Bayesian variable selection for the analysis of microarray data with censored outcomes.
Bioinformatics. 2006 Sep 15;22(18):2262-8. doi: 10.1093/bioinformatics/btl362. Epub 2006 Jul 15.
7
A stable iterative method for refining discriminative gene clusters.
BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S18. doi: 10.1186/1471-2164-9-S2-S18.
8
A marginal mixture model for selecting differentially expressed genes across two types of tissue samples.
Int J Biostat. 2008 Oct 9;4(1):Article 20. doi: 10.2202/1557-4679.1093.
9
A sub-space greedy search method for efficient Bayesian Network inference.
Comput Biol Med. 2011 Sep;41(9):763-70. doi: 10.1016/j.compbiomed.2011.06.012. Epub 2011 Jul 8.
10
[Meta-analysis of the Italian studies on short-term effects of air pollution].
Epidemiol Prev. 2001 Mar-Apr;25(2 Suppl):1-71.

Cited by

1
Bayesian Hidden Markov Models for Dependent Large-Scale Multiple Testing.
Comput Stat Data Anal. 2019 Aug;136:123-136. doi: 10.1016/j.csda.2019.01.009. Epub 2019 Jan 29.
2
Comparing Objective and Subjective Bayes Factors for the Two-Sample Comparison: The Classification Theorem in Action.
Am Stat. 2019;73(1):22-31. doi: 10.1080/00031305.2017.1322142. Epub 2018 May 10.