High-dimensional variable selection in meta-analysis for censored data.

Authors

Liu Fei, Dunson David, Zou Fei

Affiliation

IBM T. J. Watson Research Center, Yorktown Heights, New York 10598, USA.

Publication information

Biometrics. 2011 Jun;67(2):504-12. doi: 10.1111/j.1541-0420.2010.01466.x. Epub 2010 Aug 5.

Abstract

This article considers the problem of selecting predictors of time to an event from a high-dimensional set of candidate predictors using data from multiple studies. As an alternative to the current multistage testing approaches, we propose to model the study-to-study heterogeneity explicitly using a hierarchical model to borrow strength. Our method incorporates censored data through an accelerated failure time model. Using a carefully formulated prior specification, we develop a fast approach to predictor selection and shrinkage estimation for high-dimensional predictors. For model fitting, we develop a Monte Carlo expectation maximization (MC-EM) algorithm to accommodate censored data. The proposed approach, which is related to the relevance vector machine (RVM), relies on maximum a posteriori estimation to rapidly obtain a sparse estimate. As for the typical RVM, there is an intrinsic thresholding property in which unimportant predictors tend to have their coefficients shrunk to zero. We compare our method with some commonly used procedures through simulation studies. We also illustrate the method using the gene expression barcode data from three breast cancer studies.
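The abstract's core loop (an accelerated failure time model fitted by Monte Carlo EM, with a maximum a posteriori update for the coefficients) can be illustrated with a much-simplified sketch. This is not the authors' algorithm: it handles a single study rather than a hierarchy of studies, fixes the error scale `sigma`, assumes Gaussian errors and right censoring, and uses a plain ridge prior in place of the sparsity-inducing prior described above, so it shrinks coefficients but does not set any exactly to zero. The names `sample_truncated`, `mcem_aft_ridge`, and all parameters are hypothetical.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its cdf and inverse cdf


def sample_truncated(mean, lower, sigma, n_draws, rng):
    """Draw from N(mean, sigma^2) truncated to (lower, inf) via inverse-CDF sampling."""
    p_low = _STD.cdf((lower - mean) / sigma)
    u = rng.uniform(p_low, 1.0, size=n_draws)
    u = np.clip(u, 1e-12, 1.0 - 1e-12)  # keep inv_cdf inside its open domain (0, 1)
    return mean + sigma * np.array([_STD.inv_cdf(p) for p in u])


def mcem_aft_ridge(X, y, event, lam=1.0, sigma=1.0, n_iter=30, n_draws=200, seed=0):
    """Simplified MC-EM for a Gaussian AFT model: log T = X @ beta + sigma * eps.

    y     : observed log-times, min(log T, log C)
    event : boolean array, True where the event was observed (not censored)
    lam   : precision of the assumed N(0, 1/lam) ridge prior on each coefficient
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    beta = np.zeros(p)
    # MAP normal equations under the ridge prior: (X'X + lam * sigma^2 * I) beta = X'z
    A = X.T @ X + lam * sigma**2 * np.eye(p)
    censored = np.flatnonzero(~event)
    for _ in range(n_iter):
        # Monte Carlo E-step: replace each censored log-time by the average of
        # draws from its conditional distribution given log T > observed value.
        z = y.astype(float).copy()
        mu = X @ beta
        for i in censored:
            z[i] = sample_truncated(mu[i], y[i], sigma, n_draws, rng).mean()
        # M-step: MAP update of beta given the imputed log-times.
        beta = np.linalg.solve(A, X.T @ z)
    return beta


# Usage on simulated data with a sparse true coefficient vector.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
beta_true = np.array([1.5, 0.0, -1.0, 0.0, 0.0])
log_t = X @ beta_true + rng.normal(size=200)   # latent log event times
log_c = rng.normal(loc=1.0, size=200)          # log censoring times
y = np.minimum(log_t, log_c)
event = log_t <= log_c
beta_hat = mcem_aft_ridge(X, y, event)
```

Because `sigma` is held fixed, the M-step needs only the conditional means of the censored log-times, which is why the E-step averages the draws instead of storing them. Swapping the ridge prior for a prior with per-coefficient precisions, as in the relevance vector machine, is what would give the thresholding behaviour the abstract mentions.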
