Bayesian semiparametric regression models for evaluating pathway effects on continuous and binary clinical outcomes.

Author information

Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, VA, U.S.A.

Publication information

Stat Med. 2012 Jul 10;31(15):1633-51. doi: 10.1002/sim.4493. Epub 2012 Mar 22.

DOI: 10.1002/sim.4493

PMID: 22438129

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3763871/

Abstract

Many statistical methods for microarray data analysis consider one gene at a time, and they may miss subtle changes at the single gene level. This limitation may be overcome by considering a set of genes simultaneously where the gene sets are derived from prior biological knowledge. Limited work has been carried out in the regression setting to study the effects of clinical covariates and expression levels of genes in a pathway either on a continuous or on a binary clinical outcome. Hence, we propose a Bayesian approach for identifying pathways related to both types of outcomes. We compare our Bayesian approaches with a likelihood-based approach that was developed by relating a least squares kernel machine for nonparametric pathway effect with a restricted maximum likelihood for variance components. Unlike the likelihood-based approach, the Bayesian approach allows us to directly estimate all parameters and pathway effects. It can incorporate prior knowledge into Bayesian hierarchical model formulation and makes inference by using the posterior samples without asymptotic theory. We consider several kernels (Gaussian, polynomial, and neural network kernels) to characterize gene expression effects in a pathway on clinical outcomes. Our simulation results suggest that the Bayesian approach has more accurate coverage probability than the likelihood-based approach, and this is especially so when the sample size is small compared with the number of genes being studied in a pathway. We demonstrate the usefulness of our approaches through its applications to a type II diabetes mellitus data set. Our approaches can also be applied to other settings where a large number of strongly correlated predictors are present.
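The three kernel families named in the abstract can be illustrated with a minimal NumPy sketch. The parametrizations below (a Gaussian kernel scaled by `rho`, a polynomial kernel with an added constant of 1, and a hyperbolic-tangent form for the neural network kernel) are common textbook forms and are assumptions here, not necessarily the exact definitions used in the paper. The function names, the parameter defaults, and the simulated expression matrix `Z` are hypothetical.

```python
import numpy as np


def gaussian_kernel(Z, rho=1.0):
    """Gaussian kernel: K[i, j] = exp(-||z_i - z_j||^2 / rho) (assumed form)."""
    sq_norms = np.sum(Z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Z @ Z.T)
    # Clamp tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / rho)


def polynomial_kernel(Z, degree=2):
    """Polynomial kernel: K[i, j] = (z_i . z_j + 1)^degree (assumed form)."""
    return (Z @ Z.T + 1.0) ** degree


def neural_network_kernel(Z, scale=1.0, offset=0.0):
    """Tanh-type kernel: K[i, j] = tanh(scale * z_i . z_j + offset) (assumed form)."""
    return np.tanh(scale * (Z @ Z.T) + offset)


# Simulated data: 10 subjects, 50 genes in one pathway (n small relative to p).
rng = np.random.default_rng(0)
Z = rng.standard_normal((10, 50))

K_gauss = gaussian_kernel(Z, rho=Z.shape[1])  # rho set to the number of genes
K_poly = polynomial_kernel(Z, degree=2)
K_nn = neural_network_kernel(Z, scale=1.0 / Z.shape[1])

for name, K in [("gaussian", K_gauss), ("polynomial", K_poly), ("neural network", K_nn)]:
    print(name, K.shape)
```

Each function returns an n-by-n matrix over subjects, so its size depends on the sample size rather than on the number of genes in the pathway.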

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dfa/3763871/880bbf097ef9/nihms455579f1.jpg

Similar articles

Statistical properties on semiparametric regression for evaluating pathway effects.

J Stat Plan Inference. 2013 Apr;143(4):745-763. doi: 10.1016/j.jspi.2012.09.009.

Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data.

BMC Bioinformatics. 2007 Feb 28;8:67. doi: 10.1186/1471-2105-8-67.

Semiparametric Bayesian kernel survival model for evaluating pathway effects.

Stat Methods Med Res. 2019 Oct-Nov;28(10-11):3301-3317. doi: 10.1177/0962280218797360. Epub 2018 Oct 5.

Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models.

Biometrics. 2007 Dec;63(4):1079-88. doi: 10.1111/j.1541-0420.2007.00799.x.

Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models.

BMC Bioinformatics. 2008 Jun 24;9:292. doi: 10.1186/1471-2105-9-292.

Part 1. Statistical Learning Methods for the Effects of Multiple Air Pollution Constituents.

Res Rep Health Eff Inst. 2015 Jun(183 Pt 1-2):5-50.

Bayesian neural networks for bivariate binary data: an application to prostate cancer study.

Stat Med. 2005 Dec 15;24(23):3645-62. doi: 10.1002/sim.2214.

Semiparametric time varying coefficient model for matched case-crossover studies.

Stat Med. 2017 Mar 15;36(6):998-1013. doi: 10.1002/sim.7194. Epub 2016 Dec 15.

An integrative framework for Bayesian variable selection with informative priors for identifying genes and pathways.

PLoS One. 2013 Jul 3;8(7):e67672. doi: 10.1371/journal.pone.0067672. Print 2013.

Cited by

PaIRKAT: A pathway integrated regression-based kernel association test with applications to metabolomics and COPD phenotypes.

PLoS Comput Biol. 2021 Oct 22;17(10):e1008986. doi: 10.1371/journal.pcbi.1008986. eCollection 2021 Oct.

A hybrid omnibus test for generalized semiparametric single-index models with high-dimensional covariate sets.

Biometrics. 2019 Sep;75(3):757-767. doi: 10.1111/biom.13054. Epub 2019 Jun 22.

Random Effects Model for Multiple Pathway Analysis with Applications to Type II Diabetes Microarray Data.

Stat Biosci. 2015 Oct 1;7(2):167-186. doi: 10.1007/s12561-014-9109-1. Epub 2014 Jan 30.

Stratified pathway analysis to identify gene sets associated with oral contraceptive use and breast cancer.

Cancer Inform. 2014 Dec 9;13(Suppl 4):73-8. doi: 10.4137/CIN.S13973. eCollection 2014.

Statistical properties on semiparametric regression for evaluating pathway effects.

J Stat Plan Inference. 2013 Apr;143(4):745-763. doi: 10.1016/j.jspi.2012.09.009.

Kernel machine SNP-set testing under multiple candidate kernels.

Genet Epidemiol. 2013 Apr;37(3):267-75. doi: 10.1002/gepi.21715. Epub 2013 Mar 7.

References

Incorporating biological information into linear models: a Bayesian approach to the selection of pathways and genes.

Ann Appl Stat. 2011 Sep 1;5(3):1978-2002. doi: 10.1214/11-AOAS463.

Type 2 diabetes, impaired fasting glucose, and their association with increased hepatic enzyme levels among the employees in a university hospital in Thailand.

J Med Assoc Thai. 2009 Jul;92(7):961-8.

Association of SGK1 gene polymorphisms with type 2 diabetes.

Cell Physiol Biochem. 2008;21(1-3):151-60. doi: 10.1159/000113757. Epub 2008 Jan 16.

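Semiparametric regression of multidimensional genetic pathway data: least-squares kernel machines and linear mixed models.

Biometrics. 2007 Dec;63(4):1079-88. doi: 10.1111/j.1541-0420.2007.00799.x.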

Pathway analysis using random forests classification and regression.

Bioinformatics. 2006 Aug 15;22(16):2028-36. doi: 10.1093/bioinformatics/btl344. Epub 2006 Jun 29.

Serum- and glucocorticoid-inducible kinase 1 mediates salt sensitivity of glucose tolerance.

Diabetes. 2006 Jul;55(7):2059-66. doi: 10.2337/db05-1038.

Islet autoimmunity and genetic mutations in Chinese subjects initially thought to have Type 1B diabetes.

Diabet Med. 2006 Jan;23(1):67-71. doi: 10.1111/j.1464-5491.2005.01722.x.

Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Proc Natl Acad Sci U S A. 2005 Oct 25;102(43):15545-50. doi: 10.1073/pnas.0506580102. Epub 2005 Sep 30.

Inferring pathways from gene lists using a literature-derived network of biological relationships.

Bioinformatics. 2005 Mar;21(6):788-93. doi: 10.1093/bioinformatics/bti069. Epub 2004 Oct 27.

BagBoosting for tumor classification with gene expression data.

Bioinformatics. 2004 Dec 12;20(18):3583-93. doi: 10.1093/bioinformatics/bth447. Epub 2004 Oct 5.
