Using full-cohort data in nested case-control and case-cohort studies by multiple imputation.

Authors

Keogh Ruth H, White Ian R

Affiliations

MRC Biostatistics Unit, Cambridge, U.K.; Department of Medical Statistics, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, U.K.

Publication information

Stat Med. 2013 Oct 15;32(23):4021-43. doi: 10.1002/sim.5818. Epub 2013 Apr 23.

Abstract

In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure-disease association studies are therefore often based on nested case-control or case-cohort studies in which complete information is obtained only for sampled individuals. However, in the full cohort, there may be a large amount of information on cheaply available covariates and possibly a surrogate of the main exposure(s), which typically goes unused. We view the nested case-control or case-cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed. We use the fully observed data to fit the imputation models. We consider using approximate imputation models and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case-control and case-cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case-control studies and a weighted analysis for case-cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well.
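The abstract does not say how the analyses of the imputed full-cohort data sets are combined. A common choice in MI is Rubin's rules, and the minimal Python sketch below assumes that choice. The function name `pool_rubin` and the numbers in the usage example are hypothetical illustrations, not values or code from the paper.

```python
import math


def pool_rubin(estimates, variances):
    """Pool M imputed-data analyses with Rubin's rules (assumed here, not
    stated in the abstract).

    estimates: point estimates (e.g. log hazard ratios), one per imputed data set
    variances: the corresponding squared standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    # Pooled point estimate: average over the imputed data sets.
    q_bar = sum(estimates) / m
    # Within-imputation variance: average of the per-analysis variances.
    within = sum(variances) / m
    # Between-imputation variance: sample variance of the point estimates.
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    # Total variance, inflated for the finite number of imputations.
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical log hazard ratio estimates from three imputed data sets.
    est, se = pool_rubin([0.4, 0.5, 0.6], [0.01, 0.01, 0.01])
    print(f"pooled estimate = {est:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term reflects the uncertainty that comes from exposures being imputed rather than measured. In this framing, it would be expected to shrink when the full cohort carries a good surrogate of the main exposure.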

