A two-step method for variable selection in the analysis of a case-cohort study.

Affiliations

MRC Biostatistics Unit, Cambridge, UK.

MRC Epidemiology Unit, Cambridge, UK.

Publication information

Int J Epidemiol. 2018 Apr 1;47(2):597-604. doi: 10.1093/ije/dyx224.

DOI: 10.1093/ije/dyx224
PMID: 29136145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5913627/

Abstract

BACKGROUND

Accurate detection and estimation of true exposure-outcome associations is important in aetiological analysis; when there are multiple potential exposure variables of interest, methods for detecting the subset of variables most likely to have true associations with the outcome of interest are required. Case-cohort studies often collect data on a large number of variables which have not been measured in the entire cohort (e.g. panels of biomarkers). There is a lack of guidance on methods for variable selection in case-cohort studies.

METHODS

We describe and explore the application of three variable selection methods to data from a case-cohort study. These are: (i) selecting variables based on their level of significance in univariable (i.e. one-at-a-time) Prentice-weighted Cox regression models; (ii) stepwise selection applied to Prentice-weighted Cox regression; and (iii) a two-step method which applies a Bayesian variable selection algorithm to obtain posterior probabilities of selection for each variable using multivariable logistic regression followed by effect estimation using Prentice-weighted Cox regression.
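Prentice weighting, which all three methods rely on, can be illustrated with a small data-preparation sketch. It assumes the usual counting-process formulation: subcohort members are at risk from time zero, while cases outside the subcohort enter the risk set only just before their own event. The names `Subject`, `prentice_intervals` and `eps` are hypothetical and are not taken from the paper or its R package; the output rows would still need to be passed to a Cox routine that accepts (start, stop, event) data.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    id: str
    time: float         # follow-up time (event or censoring)
    event: bool         # True if the subject is a case
    in_subcohort: bool  # True if sampled into the random subcohort


def prentice_intervals(subjects, eps=1e-6):
    """Build (id, start, stop, event) rows under Prentice weighting.

    Assumption: eps is a small offset, shorter than the gap between any
    two distinct event times, so a non-subcohort case appears only in
    the risk set at its own failure time.
    """
    rows = []
    for s in subjects:
        if s.in_subcohort:
            # Subcohort members contribute to risk sets for their whole follow-up.
            rows.append((s.id, 0.0, s.time, s.event))
        elif s.event:
            # Cases outside the subcohort contribute only at their event time.
            rows.append((s.id, s.time - eps, s.time, True))
        # Non-cases outside the subcohort have no measured exposures and are skipped.
    return rows


subjects = [
    Subject("a", 5.0, False, True),
    Subject("b", 3.0, True, True),
    Subject("c", 4.0, True, False),
    Subject("d", 6.0, False, False),
]
rows = prentice_intervals(subjects, eps=0.5)
```

The large `eps` in the usage lines is only there to make the late entry of subject "c" visible; in practice a much smaller value would be chosen.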

RESULTS

Across nine different simulation scenarios, the two-step method demonstrated higher sensitivity and lower false discovery rate than the one-at-a-time and stepwise methods. In an application of the methods to data from the EPIC-InterAct case-cohort study, the two-step method identified an additional two fatty acids as being associated with incident type 2 diabetes, compared with the one-at-a-time and stepwise methods.
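The two performance measures reported for the simulations can be sketched as follows. This assumes the conventional definitions (sensitivity as the share of truly associated variables that are selected, false discovery rate as the share of selected variables that are not truly associated); the abstract does not spell out its exact definitions, and the function name and the choice to report an FDR of 0 when nothing is selected are illustrative assumptions.

```python
def selection_metrics(selected, truly_associated):
    """Return (sensitivity, false_discovery_rate) for one simulated data set."""
    selected = set(selected)
    truth = set(truly_associated)
    true_positives = len(selected & truth)
    false_positives = len(selected - truth)
    # Sensitivity is undefined when no variable is truly associated.
    sensitivity = true_positives / len(truth) if truth else float("nan")
    # Assumed convention: FDR is 0 when no variable is selected.
    fdr = false_positives / len(selected) if selected else 0.0
    return sensitivity, fdr


sens, fdr = selection_metrics(["x1", "x2", "x5"], ["x1", "x2", "x3", "x4"])
```

In a simulation study these values would typically be averaged over many replicate data sets per scenario.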

CONCLUSIONS

The two-step method enables more powerful and accurate detection of exposure-outcome associations in case-cohort studies. An R package is available to enable researchers to apply this method.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/398a/5913627/c3d3d6444683/dyx224f1.jpg