Devlin Sean M, Satagopan Jaya M
Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Hum Hered. 2016;82(1-2):21-36. doi: 10.1159/000477125. Epub 2017 Jul 26.
Logistic regression is widely used to evaluate the association between risk factors and a binary outcome. The logistic curve is symmetric around its point of inflection. Alternative families of curves, such as the additive Gompertz or Guerrero-Johnson models, have been proposed in various scenarios because they are asymmetric: disease risk may initially increase rapidly and then enter a longer period in which the rate of growth slowly decreases. When modeling binary outcomes in relation to risk factors, an additive logistic model may therefore not fit the data well. Suppose the outcome and an additive function of the risk factors are in fact related through an asymmetric function, but we model the relationship using a logistic function. We illustrate, both mathematically and through a simulation-based evaluation, that higher-order terms, such as pairwise interactions and quadratic terms, may then be required in the logistic regression model to obtain a good fit. Importantly, because significant higher-order terms may be a manifestation of model misspecification, they should be interpreted cautiously; a more pragmatic approach is to develop contrasts of disease risk from a well-fitting model. We illustrate these concepts in 2 cohort studies examining early death among late-stage colorectal and pancreatic cancer cases, and in 2 case-control studies investigating NAT2 acetylation, smoking, and advanced colorectal adenoma and bladder cancer.
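The core argument can be checked numerically with a small sketch. This is not the authors' simulation. It assumes two binary risk factors, hypothetical coefficients (-2, 1, 1), and the complementary log-log form 1 - exp(-exp(eta)) as one example of a Gompertz-type asymmetric curve. With two binary factors, a logistic model that includes the interaction term reproduces the four cell risks exactly, so its interaction coefficient equals the logit-scale contrast computed below.

```python
import math


def gompertz_risk(eta):
    """Asymmetric (complementary log-log) risk curve: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def logistic_risk(eta):
    """Symmetric logistic risk curve."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    return math.log(p / (1.0 - p))


def cell_risks(b0, b1, b2, risk):
    """True risk in each (x1, x2) cell under an additive linear predictor."""
    return {
        (x1, x2): risk(b0 + b1 * x1 + b2 * x2)
        for x1 in (0, 1)
        for x2 in (0, 1)
    }


def logit_interaction(p):
    """Interaction coefficient of the saturated logistic model for 2 binary factors."""
    return logit(p[1, 1]) - logit(p[1, 0]) - logit(p[0, 1]) + logit(p[0, 0])


# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2 = -2.0, 1.0, 1.0

risks_asym = cell_risks(B0, B1, B2, gompertz_risk)
risks_sym = cell_risks(B0, B1, B2, logistic_risk)

# Additive on the asymmetric scale, yet non-additive on the logit scale.
interaction_asym = logit_interaction(risks_asym)
# Additive on the logistic scale: the logit-scale interaction vanishes.
interaction_sym = logit_interaction(risks_sym)

# A risk contrast that does not depend on interpreting the interaction term:
# the risk difference between the doubly exposed and the unexposed.
risk_difference = risks_asym[1, 1] - risks_asym[0, 0]

print(f"logit interaction, asymmetric truth: {interaction_asym:.3f}")
print(f"logit interaction, logistic truth:   {interaction_sym:.3f}")
print(f"risk difference (1,1) vs (0,0):      {risk_difference:.3f}")
```

Under these assumptions the data-generating model has no interaction, yet the logistic fit needs a nonzero interaction term purely because the curve is misspecified. The risk difference is an example of a contrast that can be reported from a well-fitting model without interpreting that term.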