Xiong Chengjie, Schindler Suzanne E, Henson Rachel L, Wolk David A, Shaw Leslie M, Agboola Folasade, Morris John C, Lu Ruijin, Luo Jingqin
Division of Biostatistics, Washington University School of Medicine, St Louis, MO, USA.
Knight Alzheimer Disease Research Center, Washington University School of Medicine, St Louis, MO, USA.
Stat Methods Med Res. 2024 Feb;33(2):185-202. doi: 10.1177/09622802231215810. Epub 2023 Nov 22.
Evaluating correlations between disease biomarkers and clinical outcomes is crucial in biomedical research. During the early stages of many chronic diseases, changes in biomarkers and clinical outcomes are often subtle. A major challenge in detecting subtle correlations is that studies with large sample sizes are usually needed to achieve sufficient statistical power. This challenge is even greater when biofluid and imaging biomarker data are used, because the required procedures are burdensome, perceived as invasive, and/or expensive, which limits sample sizes in individual studies. Combining data across multiple studies may increase statistical power, but biomarker data may be generated using different assay platforms, scanner types, or processing protocols, which may affect measured biomarker values. Therefore, harmonizing biomarker data is essential to combining data across studies. Bridging studies involve re-processing a subset of samples or imaging scans to evaluate how biomarker values vary across studies. This raises an analytic challenge: how best to harmonize biomarker data across studies to allow unbiased and optimal estimates of their correlations with standardized clinical outcomes. We posit that a latent biomarker underlies the observed biomarkers across studies, and propose a novel approach that integrates the data in the bridging study with the study-specific biomarker data to estimate the biological correlations between biomarkers and clinical outcomes. Through extensive simulations, we compare our method to several alternative methods/algorithms often used to estimate these correlations. Finally, we demonstrate the application of this methodology to a real-world multi-center Alzheimer's disease biomarker study to correlate cerebrospinal fluid biomarker concentrations with cognitive outcomes.
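The harmonization problem described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' latent-biomarker method; it shows a simpler linear calibration on bridging samples, which is one plausible example of the kind of alternative such a method might be compared against. All function names, platform scales, noise levels, and sample sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n_a=2000, n_b=2000, n_bridge=200, rho=0.3, seed=0):
    """Simulate two studies measuring one latent biomarker on different platforms.

    All numeric settings (offsets, slopes, noise SDs) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n = n_a + n_b
    z = rng.standard_normal(n)  # latent biomarker
    # Clinical outcome with true correlation rho to the latent biomarker.
    y = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    z_a, z_b = z[:n_a], z[n_a:]
    # Study-specific platforms: different offset, scale, and measurement noise.
    x_a = 10.0 + 2.0 * z_a + 0.3 * rng.standard_normal(n_a)
    x_b = 50.0 + 8.0 * z_b + 1.2 * rng.standard_normal(n_b)
    # Bridging study: the first n_bridge samples of study B re-assayed on platform A.
    bridge_b = x_b[:n_bridge]
    bridge_a = 10.0 + 2.0 * z_b[:n_bridge] + 0.3 * rng.standard_normal(n_bridge)
    return x_a, x_b, y[:n_a], y[n_a:], bridge_a, bridge_b


def harmonize_linear(x_b, bridge_a, bridge_b):
    """Map platform-B values onto the platform-A scale via a least-squares line
    fitted on the bridging samples (a simple calibration, not a latent model)."""
    slope, intercept = np.polyfit(bridge_b, bridge_a, deg=1)
    return intercept + slope * x_b


def pooled_correlation(x_a, x_b, y_a, y_b):
    """Pearson correlation between biomarker and outcome after pooling studies."""
    x = np.concatenate([x_a, x_b])
    y = np.concatenate([y_a, y_b])
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    x_a, x_b, y_a, y_b, bridge_a, bridge_b = simulate()
    naive = pooled_correlation(x_a, x_b, y_a, y_b)
    x_b_h = harmonize_linear(x_b, bridge_a, bridge_b)
    harmonized = pooled_correlation(x_a, x_b_h, y_a, y_b)
    print(f"naive pooled r = {naive:.3f}, harmonized pooled r = {harmonized:.3f}")
```

Under these assumptions, pooling raw values mixes platform offsets into the biomarker variance and attenuates the estimated correlation, whereas calibrating on the bridging samples brings the pooled estimate closer to the simulated value. The calibration treats the bridging measurements as error-free predictors, which is a limitation of this simple approach and part of the motivation for modeling a latent biomarker instead.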