A two-step method for variable selection in the analysis of a case-cohort study.

Affiliations

MRC Biostatistics Unit, Cambridge, UK.

MRC Epidemiology Unit, Cambridge, UK.

Publication information

Int J Epidemiol. 2018 Apr 1;47(2):597-604. doi: 10.1093/ije/dyx224.

DOI:10.1093/ije/dyx224

PMID:29136145

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5913627/

Abstract

BACKGROUND

Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interest are required. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies.

METHODS

We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which applies a Bayesian variable selection algorithm to obtain posterior probabilities of selection for each variable using multivariable logistic regression followed by effect estimation using Prentice-weighted Cox regression.
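All three methods rely on Prentice-weighted Cox regression. As a rough illustration of what that weighting means, the sketch below evaluates the log pseudo-partial likelihood for a single covariate: subcohort members count in every risk set while they are under follow-up, whereas a case outside the subcohort enters the risk set only at its own failure time. The function name, argument names, and toy data are hypothetical, tied event times are assumed absent, and this is not the authors' R package.

```python
import math


def prentice_log_pseudolikelihood(beta, times, events, in_subcohort, x):
    """Log pseudo-partial likelihood for one covariate under Prentice weighting.

    Illustrative sketch only: assumes no tied event times and a single
    covariate; all argument names are hypothetical.
    """
    total = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # only observed failures contribute a term
        t_i = times[i]
        denom = 0.0
        for j in range(n):
            still_at_risk = times[j] >= t_i
            # Subcohort members are in the risk set while under follow-up;
            # a non-subcohort case is included only at its own failure time.
            if still_at_risk and (in_subcohort[j] or j == i):
                denom += math.exp(beta * x[j])
        total += beta * x[i] - math.log(denom)
    return total


# Toy data: one case outside the subcohort and two subcohort non-cases.
times = [1.0, 2.0, 3.0]
events = [1, 0, 0]
in_subcohort = [False, True, True]
x = [1.0, 0.0, 0.0]
value_at_zero = prentice_log_pseudolikelihood(0.0, times, events, in_subcohort, x)
```

In practice the coefficient would be estimated by maximising this quantity, with robust standard errors, using established survival analysis software rather than hand-written code.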

RESULTS

Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods.
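The two performance measures reported above can be made concrete with a small sketch. For a single simulated data set, sensitivity is the fraction of truly associated variables that were selected, and the false discovery proportion is the fraction of selected variables that have no true association; averaging the latter over simulation replicates gives a false discovery rate. The function and variable names below are hypothetical, and the inputs are made-up labels rather than results from the study.

```python
def selection_metrics(selected, truly_associated):
    """Return (sensitivity, false discovery proportion) for one replicate.

    Hypothetical helper for illustration; both inputs are collections of
    variable names.
    """
    selected = set(selected)
    truth = set(truly_associated)
    true_positives = len(selected & truth)
    false_positives = len(selected - truth)
    # Sensitivity is undefined when no variable is truly associated.
    sensitivity = true_positives / len(truth) if truth else float("nan")
    # By convention, selecting nothing yields zero false discoveries.
    fdp = false_positives / len(selected) if selected else 0.0
    return sensitivity, fdp


# Made-up example: four variables have true effects, three are selected.
sens, fdp = selection_metrics(["x1", "x2", "x5"], ["x1", "x2", "x3", "x4"])
```

Here two of the four true variables are recovered and one of the three selections is false.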

CONCLUSIONS

The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.
