Hsu Li, Gorfine Malka, Zucker David M
Biostatistics and Biomathematics, Fred Hutchinson Cancer Research Center.
Department of Statistics and Operations Research, Tel Aviv University.
J Am Stat Assoc. 2018;113(522):560-570. doi: 10.1080/01621459.2017.1356315. Epub 2018 Jun 12.
The population-based case-control study design has been widely used for studying the etiology of chronic diseases. It is well established that the Cox proportional hazards model can be adapted to the case-control study and that hazard ratios can be estimated by a (conditional) logistic regression model with time as either a matched set or a covariate (Prentice and Breslow, 1978). However, the baseline hazard function, a critical component in absolute risk assessment, is unidentifiable, because the ratio of cases to controls is set by the investigators and does not reflect the true disease incidence rate in the population. In this paper we propose a simple and innovative approach, which makes use of routinely collected family history information, to estimate the baseline hazard function for any logistic regression model that is fit to the risk factor data collected on cases and controls. We establish that the proposed baseline hazard function estimator is consistent and asymptotically normal, and we show via simulation that it performs well in finite samples. We illustrate the proposed method with a population-based case-control study of prostate cancer, in which the associations of various risk factors are assessed and the family history information is used to estimate the baseline hazard function.
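The identifiability problem described above can be illustrated with a deliberately simplified sketch. It is not the estimator proposed in the paper: it assumes a single binary exposure and a binary disease outcome with no time scale, and the function name `case_control_logit` and all of its parameters are hypothetical. For a saturated logistic model, the slope computed from expected case-control cell counts equals the population log odds ratio whatever the sampling design, whereas the intercept shifts with the case-to-control ratio chosen by the investigator.

```python
import math


def case_control_logit(p_exposed, risk_unexposed, odds_ratio, n_cases, n_controls):
    """Expected logistic intercept and slope for a binary exposure
    in a case-control sample (hypothetical illustration only)."""
    # Disease odds and risks in the source population.
    odds0 = risk_unexposed / (1 - risk_unexposed)
    odds1 = odds0 * odds_ratio
    risk_exposed = odds1 / (1 + odds1)

    # Overall disease probability, then exposure prevalence
    # among cases and among controls.
    p_case = p_exposed * risk_exposed + (1 - p_exposed) * risk_unexposed
    exp_given_case = p_exposed * risk_exposed / p_case
    exp_given_ctrl = p_exposed * (1 - risk_exposed) / (1 - p_case)

    # Expected cell counts under the investigator-chosen sample sizes.
    a = n_cases * exp_given_case           # exposed cases
    b = n_cases * (1 - exp_given_case)     # unexposed cases
    c = n_controls * exp_given_ctrl        # exposed controls
    d = n_controls * (1 - exp_given_ctrl)  # unexposed controls

    # Saturated logistic fit: the intercept is the log odds of being a case
    # among the unexposed; the slope is the log odds ratio.
    intercept = math.log(b / d)
    slope = math.log((a * d) / (b * c))
    return intercept, slope


if __name__ == "__main__":
    # Same population, two designs: 1:1 and 1:4 case-to-control sampling.
    for n_controls in (500, 2000):
        intercept, slope = case_control_logit(0.3, 0.01, 2.0, 500, n_controls)
        print(f"controls={n_controls}: intercept={intercept:.4f}, slope={slope:.4f}")
```

In both designs the slope equals log 2, the population log odds ratio, while the two intercepts differ by log 4, the log of the ratio of the two sampling ratios. The intercept, and by analogy the baseline hazard, therefore reflects the design rather than the disease incidence, which is why external information such as family history is needed to recover it.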