Han Yongli, Liu Danping
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA.
Stat Methods Med Res. 2020 Feb;29(2):396-412. doi: 10.1177/0962280219833089. Epub 2019 Mar 10.
Longitudinally measured biomarkers are useful for predicting the risk of clinical endpoints, because subject-specific marker trajectories contain additional information on pathology and critical windows. This work is motivated by the Scandinavian Fetal Growth Study, with the aim of predicting pregnancy outcomes from repeated ultrasound measurements during pregnancy. Although the observation times of markers often vary across individuals, it is not well understood how this variation affects risk prediction. Existing methods of longitudinal risk prediction, such as the shared random effects model and the pattern mixture model, construct a prediction implicitly as a function of the biomarkers and their observation times. Methods that ignore the longitudinal structure, such as sufficient dimension reduction and logistic regression, are more interpretable regarding how a biomarker measured in a specific time window contributes to disease risk, but often have reduced accuracy because they ignore the observation time information. We propose a novel imputation approach to handle the random observation times while preserving this direct interpretation. Through extensive simulation studies and analyses of the Scandinavian Fetal Growth Study data, we systematically compared the discrimination and calibration performance of different risk prediction methods, and found that the imputation method performs comparably to the longitudinal methods, with the advantage of better interpretability.
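The abstract does not say which imputation model the authors use, so the following is only a rough sketch of the general idea of imputing marker values at common time points when each subject is observed at different times. It assumes, purely for illustration, a per-subject least-squares line as the imputation model. The function names (`fit_line`, `impute_at_grid`), the subject data, and the time grid are all hypothetical and are not taken from the paper or from the Scandinavian Fetal Growth Study.

```python
from typing import Dict, List, Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Return the least-squares (intercept, slope) for one subject's trajectory."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two observations per subject")
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("observation times must not all be equal")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def impute_at_grid(
    times: Sequence[float], values: Sequence[float], grid: Sequence[float]
) -> List[float]:
    """Impute the marker at each common grid time from the fitted line.

    Assumption: a straight line per subject stands in for whatever
    imputation model the paper actually uses.
    """
    intercept, slope = fit_line(times, values)
    return [intercept + slope * g for g in grid]


# Made-up example: two subjects measured at different gestational weeks.
subjects: Dict[str, Tuple[List[float], List[float]]] = {
    "A": ([17.0, 25.5, 33.0], [10.0, 18.5, 26.0]),
    "B": ([19.0, 24.0, 37.0], [11.0, 16.0, 29.0]),
}
grid = [20.0, 30.0]  # hypothetical common time points (weeks)

imputed = {sid: impute_at_grid(t, y, grid) for sid, (t, y) in subjects.items()}
for sid, row in imputed.items():
    print(sid, [round(v, 3) for v in row])
```

Once every subject has a value at the same time points, those values could serve as covariates in an ordinary logistic regression, which is what would let each coefficient be read as the contribution of the marker at a specific time window. A straight line is a deliberately crude choice here; a real analysis would likely need a more flexible trajectory model and would have to account for the uncertainty in the imputed values.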