Liu Dandan, Cai Tianxi, Zheng Yingye
Department of Biostatistics, Vanderbilt University, Nashville, TN 37232, USA.
Biometrics. 2012 Dec;68(4):1219-27. doi: 10.1111/j.1541-0420.2012.01787.x. Epub 2012 Nov 22.
Identification of novel biomarkers for risk assessment is important for both effective disease prevention and optimal treatment recommendation. Discovery relies on the precious yet limited resource of stored biological samples from large prospective cohort studies. The case-cohort sampling design provides a cost-effective tool for biomarker evaluation, especially when the clinical condition of interest is rare. Existing statistical methods focus on making efficient inference on relative hazard parameters from the Cox regression model. Drawing on recent theoretical developments on the weighted likelihood for semiparametric models under two-phase studies (Breslow and Wellner, 2007), we propose statistical methods to evaluate the accuracy and predictiveness of a risk prediction biomarker with a censored time-to-event outcome under stratified case-cohort sampling. We consider nonparametric methods and a semiparametric method. We derive large sample properties of the proposed estimators and evaluate their finite sample performance using numerical studies. We illustrate the new procedures using data from the Framingham Offspring Study to evaluate the accuracy of a recently developed risk score incorporating biomarker information for predicting cardiovascular disease.
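The weighting idea behind estimation under stratified case-cohort sampling can be illustrated with a minimal sketch. It is not the estimator proposed in the paper: it assumes no censoring before the prediction time, uses the empirical sampling fraction in each stratum as the sampling probability, and all function names and data values (`ipw_weights`, `weighted_tpr_fpr`, the ten-subject cohort) are hypothetical. Each sampled subject is weighted by the inverse of its stratum's sampling fraction, and the true and false positive rates of the marker at a threshold are then computed as weighted proportions.

```python
from collections import Counter


def ipw_weights(strata, sampled):
    """Inverse empirical sampling fraction per stratum; 0 for unsampled subjects."""
    n_full = Counter(strata)
    n_sub = Counter(s for s, v in zip(strata, sampled) if v)
    return [n_full[s] / n_sub[s] if v else 0.0 for s, v in zip(strata, sampled)]


def weighted_tpr_fpr(marker, event_by_t, weights, c):
    """Weighted true and false positive rates of the rule `marker > c`.

    Assumption for illustration only: event_by_t is fully observed,
    i.e. nobody is censored before the prediction time t.
    """
    tp = pos = fp = neg = 0.0
    for m, d, w in zip(marker, event_by_t, weights):
        if w == 0.0:
            continue  # marker is not measured outside the subsample
        if d:
            pos += w
            tp += w * (m > c)
        else:
            neg += w
            fp += w * (m > c)
    return tp / pos, fp / neg


# Hypothetical cohort of 10: all cases sampled, non-cases sampled by stratum.
strata = ["case", "case", "A", "A", "A", "A", "B", "B", "B", "B"]
sampled = [True, True, True, True, False, False, True, False, False, False]
marker = [2.0, 0.5, 1.5, 0.2, None, None, 0.3, None, None, None]
event_by_t = [True, True] + [False] * 8

weights = ipw_weights(strata, sampled)
tpr, fpr = weighted_tpr_fpr(marker, event_by_t, weights, c=1.0)
print(weights)
print(tpr, fpr)
```

In this toy cohort the single sampled non-case from stratum B stands in for four cohort members, so it contributes four times the weight of a case to the false positive rate. Handling censored event times requires additional machinery that this sketch deliberately omits.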