Kernel machine testing for risk prediction with stratified case cohort studies.

Authors

Payne Rebecca, Neykov Matey, Jensen Majken Karoline, Cai Tianxi

Affiliation

Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A.

Publication information

Biometrics. 2016 Jun;72(2):372-81. doi: 10.1111/biom.12452. Epub 2015 Dec 21.

DOI: 10.1111/biom.12452

PMID: 26692376

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4899160/

Abstract

Large assembled cohorts with banked biospecimens offer valuable opportunities to identify novel markers for risk prediction. When the outcome of interest is rare, an effective strategy to conserve limited biological resources while maintaining reasonable statistical power is the case cohort (CCH) sampling design, in which expensive markers are measured on a subset of cases and controls. However, the CCH design introduces significant analytical complexity due to outcome-dependent, finite-population sampling. Current methods for analyzing CCH studies focus primarily on the estimation of simple survival models with linear effects; testing and estimation procedures that can efficiently capture complex non-linear marker effects for CCH data remain elusive. In this article, we propose inverse probability weighted (IPW) variance component type tests for identifying important marker sets through a Cox proportional hazards kernel machine (CoxKM) regression framework previously considered for full cohort studies (Cai et al., 2011). The optimal choice of kernel, while vitally important to attain high power, is typically unknown for a given dataset. Thus, we also develop robust testing procedures that adaptively combine information from multiple kernels. The proposed IPW test statistics have complex null distributions that cannot easily be approximated explicitly. Furthermore, due to the correlation induced by CCH sampling, standard resampling methods such as the bootstrap fail to approximate the distribution correctly. We, therefore, propose a novel perturbation resampling scheme that can effectively recover the induced correlation structure. Results from extensive simulation studies suggest that the proposed IPW CoxKM testing procedures work well in finite samples. The proposed methods are further illustrated by application to a Danish CCH study of Apolipoprotein C-III markers on the risk of coronary heart disease.
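The abstract names the ingredients of the test (inverse probability weights from stratified sampling, a kernel over the marker set, and a variance component type statistic) without giving formulas. The following is a minimal Python sketch of how those pieces could fit together, not the authors' CoxKM procedure. All function names are hypothetical, the weights assume simple random sampling without replacement within each stratum, and `residuals` is a placeholder for whatever null-model residuals the fitted survival model supplies.

```python
import numpy as np

def ipw_weights(strata, sampled):
    """Inverse probability weights under stratified sampling (assumed design).

    A sampled subject in stratum s receives N_s / n_s, where N_s is the
    stratum size in the full cohort and n_s the number sampled from it.
    Unsampled subjects receive weight 0.
    """
    strata = np.asarray(strata)
    sampled = np.asarray(sampled, dtype=bool)
    weights = np.zeros(strata.shape[0], dtype=float)
    for s in np.unique(strata):
        in_stratum = strata == s
        n_total = in_stratum.sum()
        n_sampled = (in_stratum & sampled).sum()
        if n_sampled > 0:
            weights[in_stratum & sampled] = n_total / n_sampled
    return weights

def gaussian_kernel(Z, rho):
    """Gaussian kernel matrix K[i, j] = exp(-||Z_i - Z_j||^2 / rho)."""
    Z = np.asarray(Z, dtype=float)
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / rho)

def weighted_score_statistic(residuals, weights, K):
    """Quadratic form (w * r)^T K (w * r); a stand-in for an IPW score statistic."""
    wr = np.asarray(weights, dtype=float) * np.asarray(residuals, dtype=float)
    return float(wr @ K @ wr)
```

This sketch only computes a statistic. It does not supply a null distribution: as the abstract notes, the sampling-induced correlation makes the standard bootstrap unreliable here, and the authors' perturbation resampling scheme is not reproduced.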


Similar articles

Kernel machine testing for risk prediction with stratified case cohort studies.

Biometrics. 2016 Jun;72(2):372-81. doi: 10.1111/biom.12452. Epub 2015 Dec 21.

Robust risk prediction with biomarkers under two-phase stratified cohort design.

Biometrics. 2016 Dec;72(4):1037-1045. doi: 10.1111/biom.12515. Epub 2016 Apr 1.

Omnibus risk assessment via accelerated failure time kernel machine modeling.

Biometrics. 2013 Dec;69(4):861-73. doi: 10.1111/biom.12098. Epub 2013 Nov 6.

Kernel machine approach to testing the significance of multiple genetic markers for risk prediction.

Biometrics. 2011 Sep;67(3):975-86. doi: 10.1111/j.1541-0420.2010.01544.x. Epub 2011 Jan 31.

Resampling Procedures for Making Inference under Nested Case-control Studies.

J Am Stat Assoc. 2013 Jan 1;108(504):1532-1544. doi: 10.1080/01621459.2013.856715.

Pathway aggregation for survival prediction via multiple kernel learning.

Stat Med. 2018 Jul 20;37(16):2501-2515. doi: 10.1002/sim.7681. Epub 2018 Apr 17.

Biomarker evaluation under imperfect nested case-control design.

Stat Med. 2021 Aug 15;40(18):4035-4052. doi: 10.1002/sim.9012. Epub 2021 Apr 29.

Identifying genetic marker sets associated with phenotypes via an efficient adaptive score test.

Biostatistics. 2012 Sep;13(4):776-90. doi: 10.1093/biostatistics/kxs015. Epub 2012 Jun 25.

Retrospective likelihood-based methods for analyzing case-cohort genetic association studies.

Biometrics. 2015 Dec;71(4):960-8. doi: 10.1111/biom.12342. Epub 2015 Jul 14.

Analysis of multiple survival events in generalized case-cohort designs.

Biometrics. 2018 Dec;74(4):1250-1260. doi: 10.1111/biom.12923. Epub 2018 Jul 10.

Cited by

Plasma CD36 and Incident Diabetes: A Case-Cohort Study in Danish Men and Women.

Diabetes Metab J. 2020 Feb;44(1):134-142. doi: 10.4093/dmj.2018.0273. Epub 2019 Oct 18.

References

Bootstrap for the case-cohort design.

Biometrika. 2014 Jun;101(2):465-476. doi: 10.1093/biomet/asu004.

Omnibus risk assessment via accelerated failure time kernel machine modeling.

Biometrics. 2013 Dec;69(4):861-73. doi: 10.1111/biom.12098. Epub 2013 Nov 6.

Evaluating the predictive value of biomarkers with stratified case-cohort design.

Biometrics. 2012 Dec;68(4):1219-27. doi: 10.1111/j.1541-0420.2012.01787.x. Epub 2012 Nov 22.

Apolipoprotein C-III as a Potential Modulator of the Association Between HDL-Cholesterol and Incident Coronary Heart Disease.

J Am Heart Assoc. 2012 Apr;1(2). doi: 10.1161/JAHA.111.000232. Epub 2012 Apr 24.

Kernel machine approach to testing the significance of multiple genetic markers for risk prediction.

Biometrics. 2011 Sep;67(3):975-86. doi: 10.1111/j.1541-0420.2010.01544.x. Epub 2011 Jan 31.

A Z-theorem with Estimated Nuisance Parameters and Correction Note for 'Weighted Likelihood for Semiparametric Models and Two-phase Stratified Samples, with Application to Cox Regression'.

Scand Stat Theory Appl. 2008 Mar 1;35(1):186-192. doi: 10.1111/j.1467-9469.2007.00574.x.

Weighted analyses for cohort sampling designs.

Lifetime Data Anal. 2009 Mar;15(1):24-40. doi: 10.1007/s10985-008-9095-z. Epub 2008 Aug 19.

Obesity, behavioral lifestyle factors, and risk of acute coronary events.

Circulation. 2008 Jun 17;117(24):3062-9. doi: 10.1161/CIRCULATIONAHA.107.759951. Epub 2008 Jun 9.

A prospective evaluation of insulin and insulin-like growth factor-I as risk factors for endometrial cancer.

Cancer Epidemiol Biomarkers Prev. 2008 Apr;17(4):921-9. doi: 10.1158/1055-9965.EPI-07-2686.

Study design, exposure variables, and socioeconomic determinants of participation in Diet, Cancer and Health: a population-based prospective cohort study of 57,053 men and women in Denmark.

Scand J Public Health. 2007;35(4):432-41. doi: 10.1080/14034940601047986.
