Comparison between inverse-probability weighting and multiple imputation in Cox model with missing failure subtype.

Affiliations

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.

Division of Biostatistics, University of Minnesota, Minneapolis, MN, USA.

Publication information

Stat Methods Med Res. 2024 Feb;33(2):344-356. doi: 10.1177/09622802231226328. Epub 2024 Jan 23.

Abstract

Identifying and distinguishing risk factors for heterogeneous disease subtypes has been of great interest. However, missingness in disease subtypes is a common problem in such analyses. Several methods have been proposed to deal with the missing data, including complete-case analysis, inverse-probability weighting, and multiple imputation. Although the existing literature has compared these methods in missing-data problems, none has focused on the competing risk setting. In this paper, we discuss the assumptions required when complete-case analysis, inverse-probability weighting, and multiple imputation are used to deal with the missing failure subtype problem, focusing on how to implement these methods under various realistic scenarios in competing risk settings. We also compare these three methods with respect to bias, efficiency, and robustness to model misspecification using simulation studies. Our results show that complete-case analysis can be seriously biased when the missing completely at random assumption does not hold. Inverse-probability weighting and multiple imputation estimators are valid when we correctly specify the corresponding models for missingness and for imputation, and multiple imputation typically shows higher efficiency than inverse-probability weighting. However, in real-world studies, building imputation models for the missing subtypes can be more challenging than building missingness models. In that case, inverse-probability weighting may be preferred for its ease of use. We also propose two automated model selection procedures and demonstrate their usage in a study of the association between smoking and colorectal cancer subtypes in the Nurses' Health Study and Health Professionals Follow-up Study.
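
The weighting step behind inverse-probability weighting can be illustrated with a minimal sketch. This is not the authors' implementation: the field names (`stratum`, `subtype`) and the toy data are hypothetical, and the missingness model is assumed to be a simple observed proportion within each stratum, whereas a regression model such as logistic regression would be a more typical choice in practice. Each case with an observed subtype is weighted by the inverse of its estimated probability of being observed, and cases with a missing subtype receive weight zero.

```python
from collections import defaultdict


def ipw_weights(cases):
    """Return one weight per case.

    cases: list of dicts with hypothetical keys 'stratum' (a covariate
    category) and 'subtype' (None when the failure subtype is missing).
    Assumption: missingness depends only on 'stratum'.
    """
    total = defaultdict(int)     # number of cases per stratum
    observed = defaultdict(int)  # number of cases with known subtype per stratum
    for c in cases:
        total[c["stratum"]] += 1
        if c["subtype"] is not None:
            observed[c["stratum"]] += 1

    weights = []
    for c in cases:
        if c["subtype"] is None:
            # Cases with a missing subtype drop out of the weighted analysis.
            weights.append(0.0)
        else:
            # 1 / P(observed | stratum), with P estimated as observed / total.
            weights.append(total[c["stratum"]] / observed[c["stratum"]])
    return weights


# Toy data, invented for illustration only.
cases = [
    {"stratum": "early", "subtype": "A"},
    {"stratum": "early", "subtype": "B"},
    {"stratum": "early", "subtype": None},
    {"stratum": "early", "subtype": "A"},
    {"stratum": "late", "subtype": "A"},
    {"stratum": "late", "subtype": None},
]
weights = ipw_weights(cases)
```

In this sketch the weights of the observed cases in each stratum sum to the total number of cases in that stratum, so the weighted complete cases stand in for all cases. In a full analysis these weights would be supplied to a weighted cause-specific Cox fit, which is not shown here.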

