Model misspecification and robust analysis for outcome-dependent sampling designs under generalized linear models.

Author information

Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA.

Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Publication information

Stat Med. 2023 Apr 30;42(9):1338-1352. doi: 10.1002/sim.9673. Epub 2023 Feb 9.

Abstract

Outcome-dependent sampling (ODS) is a commonly used class of sampling designs to increase estimation efficiency in settings where response information (and possibly adjuster covariates) is available, but the exposure is expensive and/or cumbersome to collect. We focus on ODS within the context of a two-phase study, where in Phase One the response and adjuster covariate information is collected on a large cohort that is representative of the target population, but the expensive exposure variable is not yet measured. In Phase Two, using response information from Phase One, we selectively oversample a subset of informative subjects in whom we collect expensive exposure information. Importantly, the Phase Two sample is no longer representative, and we must use ascertainment-correcting analysis procedures for valid inferences. In this paper, we focus on likelihood-based analysis procedures, particularly a conditional-likelihood approach and a full-likelihood approach. Whereas the full-likelihood retains incomplete Phase One data for subjects not selected into Phase Two, the conditional-likelihood explicitly conditions on Phase Two sample selection (ie, it is a "complete case" analysis procedure). These designs and analysis procedures are typically implemented assuming a known, parametric model for the response distribution. However, in this paper, we approach analyses implementing a novel semi-parametric extension to generalized linear models (SPGLM) to develop likelihood-based procedures with improved robustness to misspecification of distributional assumptions. We specifically focus on the common setting where standard GLM distributional assumptions are not satisfied (eg, misspecified mean/variance relationship). We aim to provide practical design guidance and flexible tools for practitioners in these settings.
