Department of Epidemiology and Biostatistics, College of Public Health, University of Arizona, Tucson, Arizona, USA.
National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, Maryland, USA.
Stat Med. 2023 Jun 30;42(14):2275-2292. doi: 10.1002/sim.9723. Epub 2023 Mar 30.
Missing covariate problems are common in biomedical and electronic medical record studies that evaluate the relationship between a biomarker and a clinical outcome when biomarker data are not collected for all subjects. However, the missingness mechanism cannot be verified from the observed data. When missing not at random (MNAR) is suspected, researchers often perform a sensitivity analysis to evaluate the impact of various missingness mechanisms. Under the selection modeling framework, we propose a sensitivity analysis approach with a standardized sensitivity parameter using a nonparametric multiple imputation strategy. The proposed approach requires fitting two working models to derive two predictive scores: one for predicting missing covariate values and the other for predicting missingness probabilities. For each missing covariate observation, the two predictive scores, together with the pre-specified sensitivity parameter, are used to define an imputing set. The proposed approach is expected to be robust against misspecification of the selection model and the sensitivity parameter, since neither is used directly to impute missing covariate values. A simulation study is conducted to assess the performance of the proposed approach when MNAR is induced by Heckman's selection model. Simulation results show that the proposed approach can produce plausible regression coefficient estimates. The proposed sensitivity analysis approach is also applied to evaluate the impact of MNAR on the relationship between post-operative outcomes and incomplete pre-operative hemoglobin A1c level for patients who underwent carotid intervention for advanced atherosclerotic disease.
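To make the idea of an imputing set concrete, the following is a minimal Python sketch of one imputation pass. It is not the authors' algorithm: the abstract does not specify the working models, the distance, or how the sensitivity parameter enters, so everything here is an assumption made for illustration. The function names (`fit_linear`, `fit_logistic`, `impute_once`), the linear and logistic working models, the weight `w`, the set size `k`, and the choice to shift the covariate score by `delta` residual standard deviations are all hypothetical. The sketch also does not propagate working-model uncertainty (for example through bootstrap resampling).

```python
import numpy as np


def fit_linear(design, target):
    """Ordinary least-squares coefficients (assumed working model 1)."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def fit_logistic(design, labels, n_iter=25):
    """Logistic coefficients via Newton-Raphson (assumed working model 2)."""
    beta = np.zeros(design.shape[1])
    ridge = 1e-8 * np.eye(design.shape[1])  # tiny ridge for numerical stability
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-design @ beta))
        weights = prob * (1.0 - prob)
        hessian = design.T @ (design * weights[:, None]) + ridge
        beta = beta + np.linalg.solve(hessian, design.T @ (labels - prob))
    return beta


def impute_once(x, design, delta, k=5, w=0.8, rng=None):
    """Return a copy of x with each NaN replaced by a donor's observed value.

    x      : covariate with np.nan marking missing entries
    design : fully observed predictors, including an intercept column
    delta  : hypothetical standardized sensitivity parameter, in units of
             the residual SD of the covariate working model (assumption)
    """
    rng = np.random.default_rng() if rng is None else rng
    missing = np.isnan(x)
    observed_idx = np.flatnonzero(~missing)

    # Score 1: predicted covariate value, fitted on complete cases only.
    score_x = design @ fit_linear(design[observed_idx], x[observed_idx])
    # Score 2: linear predictor of the missingness indicator, fitted on all cases.
    score_m = design @ fit_logistic(design, missing.astype(float))

    resid_sd = np.std(x[observed_idx] - score_x[observed_idx])
    sd_x, sd_m = score_x.std(), score_m.std()

    completed = x.copy()
    for i in np.flatnonzero(missing):
        # Assumed role of delta: shift the target covariate score before matching.
        dx = (score_x[i] + delta * resid_sd - score_x[observed_idx]) / sd_x
        dm = (score_m[i] - score_m[observed_idx]) / sd_m
        dist = np.sqrt(w * dx**2 + (1.0 - w) * dm**2)
        imputing_set = observed_idx[np.argsort(dist)[:k]]  # k nearest observed cases
        completed[i] = x[rng.choice(imputing_set)]  # draw one donor value
    return completed
```

Under these assumptions, every imputed value is an observed value from a nearby subject, so `delta` only influences which subjects are considered near. A sensitivity analysis along these lines would call `impute_once` several times for each value on a grid of `delta`, fit the outcome model to each completed data set, and combine the estimates with the usual multiple imputation rules.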