Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK; MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK.
Department of Medical Statistics, London School of Hygiene & Tropical Medicine, WC1E 7HT, London, UK.
J Clin Epidemiol. 2023 Feb;154:33-41. doi: 10.1016/j.jclinepi.2022.11.022. Epub 2022 Dec 1.
To investigate whether a complete-case logistic regression gives a biased estimate of the exposure odds ratio (OR) if missingness depends on a continuous outcome but a binary version of that outcome is used for analysis; and to examine whether any bias could be reduced by including a misclassified form of the incomplete outcome as an auxiliary variable in multiple imputation (MI).
Analytical investigation, simulation study, and data from a UK cohort.
The exposure OR was biased when the probability of being a complete case was independently associated with the exposure and the (continuous) outcome, but the bias was generally small unless the association with the outcome was strong. Where exposure and (continuous) outcome interacted in their effect on this probability, the bias was large, particularly at high levels of missing data. Including the auxiliary variable in MI gave important bias reductions when the variable had high sensitivity and specificity.
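One way to see where such bias comes from is a short derivation for the simplest case. The sketch below assumes a binary exposure X (an assumption made here for illustration only), writes D for the binary version of the continuous outcome Y, and R = 1 for a complete case; the notation is not taken from the paper.

```latex
% Assumed notation: X binary exposure, Y continuous outcome,
% D binary version of Y, R = 1 for complete cases.
\begin{align*}
\pi_{xd} &= P(R = 1 \mid X = x, D = d)
          = E\!\left[\, P(R = 1 \mid X = x, Y) \;\middle|\; X = x, D = d \,\right], \\
P(D = d \mid X = x, R = 1) &\propto P(D = d \mid X = x)\, \pi_{xd}, \\
\mathrm{OR}_{\text{complete case}} &= \mathrm{OR}_{\text{full}} \times
    \frac{\pi_{11}\, \pi_{00}}{\pi_{10}\, \pi_{01}} .
\end{align*}
```

The complete-case OR equals the full-data OR only when the ratio of the four averaged selection probabilities is 1, for example when \pi_{xd} factorises into a term in x times a term in d. When selection acts through Y rather than D, each \pi_{xd} is an average over the distribution of Y within the stratum, and that distribution differs by exposure, so the ratio generally departs from 1 even without an exposure-outcome interaction in the missingness model.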
The robustness of logistic regression to missing data is not maintained when the outcome is a binary version of an underlying continuous measure, but the bias will be small unless the association between the continuous outcome and missingness is strong.
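The pattern described above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' simulation design: the binary exposure, the effect size of 0.5, the dichotomisation cut-off at 0, the missingness coefficients, and all function names are assumptions chosen for illustration. With a single binary exposure, the OR from logistic regression equals the cross-product ratio of the 2x2 table, so no model fitting is needed.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(table):
    """Cross-product OR; table[x][d] counts exposure x and binary outcome d."""
    return (table[1][1] * table[0][0]) / (table[1][0] * table[0][1])


def simulate(n, a_x, a_y, a_xy, seed=2023):
    """Return (full-data OR, complete-case OR) for one simulated dataset.

    Hypothetical data-generating model (illustrative values only):
      X ~ Bernoulli(0.5), Y = 0.5 * X + N(0, 1), D = 1 if Y > 0,
      P(complete case) = expit(a_x * X + a_y * Y + a_xy * X * Y).
    Missingness depends on the continuous Y, but only D is analysed.
    """
    rng = random.Random(seed)
    full = [[0, 0], [0, 0]]
    complete = [[0, 0], [0, 0]]
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 0.5 * x + rng.gauss(0.0, 1.0)
        d = 1 if y > 0.0 else 0
        full[x][d] += 1
        p_complete = expit(a_x * x + a_y * y + a_xy * x * y)
        if rng.random() < p_complete:
            complete[x][d] += 1
    return odds_ratio(full), odds_ratio(complete)


def log_bias(n, a_x, a_y, a_xy):
    """Difference in log OR: complete cases minus full data."""
    or_full, or_cc = simulate(n, a_x, a_y, a_xy)
    return math.log(or_cc) - math.log(or_full)


if __name__ == "__main__":
    scenarios = [
        ("main effects only", (0.5, 0.3, 0.0)),
        ("exposure-outcome interaction", (0.5, 0.3, 1.0)),
    ]
    for label, coefs in scenarios:
        print(label, round(log_bias(100_000, *coefs), 3))
```

Under these assumed settings, the complete-case log OR should sit close to the full-data value when exposure and outcome affect missingness independently, and move clearly away from it once the interaction term is non-zero. Raising `a_y` in the first scenario is a simple way to explore how a stronger outcome-missingness association increases the bias.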