Department of Data Science, The Institute of Statistical Mathematics.
Department of Statistics, Radiation Effects Research Foundation.
J Epidemiol. 2023 Oct 5;33(10):508-513. doi: 10.2188/jea.JE20210509. Epub 2022 Oct 19.
In case-cohort studies with binary outcomes, ordinary logistic regression analyses have been widely used because of their computational simplicity. However, the resultant odds ratio estimates cannot be interpreted as relative risk measures unless the event rate is low. The risk ratio and risk difference are more favorable outcome measures that are directly interpreted as effect measures without the rare disease assumption.
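As a small arithmetic illustration of why the odds ratio stops approximating the risk ratio once the event rate is not low, the following sketch computes all three measures from two hypothetical pairs of risks. The function name `effect_measures` and the risk values are assumptions chosen for illustration and do not come from the study data.

```python
def effect_measures(risk_exposed, risk_unexposed):
    """Return (risk ratio, risk difference, odds ratio) for two risks."""
    def odds(p):
        return p / (1.0 - p)

    risk_ratio = risk_exposed / risk_unexposed
    risk_difference = risk_exposed - risk_unexposed
    odds_ratio = odds(risk_exposed) / odds(risk_unexposed)
    return risk_ratio, risk_difference, odds_ratio


# Hypothetical rare outcome: risks of 2% vs 1%.
rare = effect_measures(0.02, 0.01)
# Hypothetical common outcome: risks of 40% vs 20%.
common = effect_measures(0.40, 0.20)

print("rare:   RR=%.3f RD=%.3f OR=%.3f" % rare)
print("common: RR=%.3f RD=%.3f OR=%.3f" % common)
```

In both scenarios the risk ratio is 2, but the odds ratio is about 2.02 in the rare case and about 2.67 in the common case, so reading the odds ratio as a relative risk overstates the effect in the second scenario.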
We provide pseudo-Poisson and pseudo-normal linear regression methods for estimating risk ratios and risk differences in analyses of case-cohort studies. These multivariate regression models are fitted by weighting observations by the inverses of their sampling probabilities. In addition, the precision of the risk ratio and risk difference estimators can be improved by using auxiliary variables that are readily measured on all members of the whole cohort, specifically by adopting calibrated or estimated weights. Finally, we provide computational code in R (R Foundation for Statistical Computing, Vienna, Austria) that can easily perform these methods.
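The idea of fitting these models with inverse sampling probability weights can be sketched as follows. This is a minimal illustration in Python with NumPy, not the R code provided with the paper. It assumes a simulated cohort with one binary exposure, a subcohort drawn by simple random sampling with known probability `p_sub`, all cases retained with weight 1, and sampled non-cases weighted by `1 / p_sub`; the paper's exact weighting scheme may differ. The helper names `fit_weighted_poisson` and `fit_weighted_linear` are hypothetical, and only point estimates are computed. Valid standard errors would need a robust or design-based variance estimator, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated full cohort (assumption): true risks 0.2 (x=0) and 0.4 (x=1),
# so the true risk ratio is 2 and the true risk difference is 0.2.
n = 20000
x = rng.binomial(1, 0.5, size=n)
y = rng.binomial(1, 0.2 * 2.0 ** x)

# Case-cohort sampling: a random subcohort plus every case in the cohort.
p_sub = 0.2
in_subcohort = rng.random(n) < p_sub
selected = in_subcohort | (y == 1)

# Inverse sampling probability weights: cases are always sampled (weight 1),
# non-cases enter only through the subcohort (weight 1 / p_sub).
w = np.where(y == 1, 1.0, 1.0 / p_sub)[selected]
X = np.column_stack([np.ones(selected.sum()), x[selected]])
ys = y[selected].astype(float)


def fit_weighted_poisson(X, y, w, n_iter=50, tol=1e-10):
    """Weighted log-link Poisson fit by Newton-Raphson (point estimates only)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(np.average(y, weights=w))  # start at the weighted mean risk
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (w * (y - mu))
        info = X.T @ (X * (w * mu)[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def fit_weighted_linear(X, y, w):
    """Weighted least squares with an identity link (point estimates only)."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)


rr_hat = float(np.exp(fit_weighted_poisson(X, ys, w)[1]))
rd_hat = float(fit_weighted_linear(X, ys, w)[1])
print(f"estimated risk ratio: {rr_hat:.3f}")
print(f"estimated risk difference: {rd_hat:.3f}")
```

The exponentiated slope of the weighted Poisson fit estimates the risk ratio, and the slope of the weighted linear fit estimates the risk difference. Calibrated or estimated weights based on auxiliary variables would replace `w` in this sketch.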
Through numerical analyses of artificially simulated data and the National Wilms Tumor Study data, accurate risk ratio and risk difference estimates were obtained using the pseudo-Poisson and pseudo-normal linear regression methods. Also, using the auxiliary variable information from the whole cohort, precisions of these estimators were markedly improved.
Ordinary logistic regression analyses may provide effect measure estimates that cannot be interpreted as relative risks, and the risk ratio and risk difference estimation methods are effective alternative approaches for case-cohort studies. These methods are especially recommended in situations where the event rate is not low.