Department of Community Health Sciences, Brock University, 500 Glenridge Ave, St Catharines, ON, Canada.
J Natl Cancer Inst. 2011 Jul 6;103(13):1058-68. doi: 10.1093/jnci/djr173. Epub 2011 May 23.
Identification of individuals at high risk for lung cancer should be of value to individuals, patients, clinicians, and researchers. Existing prediction models have only modest capabilities to classify persons at risk accurately.
Prospective data from 70 962 control subjects in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) were used in models for the general population (model 1) and for a subcohort of ever-smokers (N = 38 254) (model 2). Both models included age, socioeconomic status (education), body mass index, family history of lung cancer, chronic obstructive pulmonary disease, recent chest x-ray, smoking status (never, former, or current), pack-years smoked, and smoking duration. Model 2 also included smoking quit-time (time in years since ever-smokers permanently quit smoking). External validation was performed with 44 223 PLCO intervention arm participants who completed a supplemental questionnaire and were subsequently followed. Known available risk factors were included in logistic regression models. Bootstrap optimism-corrected estimates of predictive performance were calculated (internal validation). Nonlinear relationships for age, pack-years smoked, smoking duration, and quit-time were modeled using restricted cubic splines. All reported P values are two-sided.
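As an illustration of the restricted cubic spline terms mentioned above, the following is a minimal sketch of one common construction (the unnormalized truncated-power form). The abstract does not report knot locations or the software used, so the function name `rcs_basis` and the knot values are hypothetical and chosen only for demonstration.

```python
def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline.

    The spline is cubic between knots and constrained to be linear
    below the first knot and above the last knot.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least 3 knots")
    if any(b <= a for a, b in zip(knots, knots[1:])):
        raise ValueError("knots must be strictly increasing")

    def pos3(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return float(u) ** 3 if u > 0 else 0.0

    t_prev, t_last = knots[-2], knots[-1]
    span = t_last - t_prev
    terms = [float(x)]  # the linear term
    for t_j in knots[:-2]:
        terms.append(
            pos3(x - t_j)
            - pos3(x - t_prev) * (t_last - t_j) / span
            + pos3(x - t_last) * (t_prev - t_j) / span
        )
    return terms


# Hypothetical knots for age in years (assumed, not taken from the paper).
age_terms = rcs_basis(62.0, [55.0, 62.0, 70.0])
```

Each returned term would enter the logistic regression as its own covariate, so a variable with k knots contributes k - 1 coefficients.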
During follow-up of the control arm subjects (median 9.2 years), 1040 lung cancers occurred. During follow-up of the external validation sample (median 3.0 years), 213 lung cancers occurred. For models 1 and 2, the bootstrap optimism-corrected areas under the receiver operating characteristic curve (AUCs) were 0.857 and 0.805, and the calibration slopes (model-predicted probabilities vs observed probabilities) were 0.987 and 0.979, respectively. In the external validation sample, models 1 and 2 had AUCs of 0.841 and 0.784, respectively. These models had high discrimination in women, men, whites, and nonwhites.
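The two performance measures reported above can be sketched in a few lines. This is not the authors' code: it assumes the area under the curve is computed as the concordance probability, and that the calibration slope is the coefficient from a logistic regression of the outcome on the logit of the predicted probability, which is a common definition but is not spelled out in the abstract. The function names and the toy data are hypothetical.

```python
import math


def auc(probs, outcomes):
    """Probability that a random case is ranked above a random non-case
    (ties count as one half)."""
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    controls = [p for p, y in zip(probs, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def calibration_slope(probs, outcomes, iterations=25):
    """Slope from logistic regression of outcome on logit(predicted prob),
    fitted by Newton-Raphson. Predictions must lie strictly in (0, 1)."""
    x = [math.log(p / (1.0 - p)) for p in probs]
    a, b = 0.0, 0.0  # intercept and slope
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu          # score for the intercept
            g1 += (yi - mu) * xi   # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


# Toy data (invented): observed event rates match the predictions exactly.
toy_probs = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
toy_outcomes = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
toy_auc = auc(toy_probs, toy_outcomes)
toy_slope = calibration_slope(toy_probs, toy_outcomes)
```

A slope near 1 indicates that predicted risks are neither too extreme nor too conservative; values below 1 typically signal overfitting.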
The PLCO lung cancer risk models demonstrate high discrimination and calibration.