Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA.
Department of Internal Medicine/Nephrology, University of Michigan, Ann Arbor, Michigan, USA.
Stat Med. 2020 Nov 20;39(26):3685-3699. doi: 10.1002/sim.8687. Epub 2020 Jul 27.
Longitudinal biomarker data are often collected in studies, providing important information regarding the probability of an outcome of interest occurring at a future time. With many new and evolving technologies for biomarker discovery, the number of biomarker measurements available for analysis of disease progression has increased dramatically. A large amount of data provides a more complete picture of a patient's disease progression, potentially allowing us to make more accurate and reliable predictions, but the magnitude of available data poses challenges for most statistical analysts. Existing approaches suffer immensely from the curse of dimensionality. In this article, we propose methods for making dynamic risk predictions using high-dimensional, repeatedly measured biomarkers, including cases when the number of biomarkers is close to the sample size. The proposed methods are computationally simple, yet sufficiently flexible to capture complex relationships between longitudinal biomarkers and potentially censored event times. The proposed approaches are evaluated by extensive simulation studies and are further illustrated by an application to a data set from the Nephrotic Syndrome Study Network.