Newlin Nancy R, Kanakaraj Praitayini, Li Thomas, Pechman Kimberly, Archer Derek, Jefferson Angela, Landman Bennett, Moyer Daniel
Department of Computer Science, Vanderbilt University, Nashville, TN, USA.
Department of Biomedical Engineering, Vanderbilt University, Nashville, Tennessee, USA.
Proc SPIE Int Soc Opt Eng. 2024 Feb;12930. doi: 10.1117/12.3009645. Epub 2024 Apr 2.
Multi-site diffusion MRI data is often acquired on different scanners and with distinct protocols. Differences in hardware and acquisition result in data that contains site-dependent information, which confounds connectome analyses that aim to combine such multi-site data. We propose a data-driven solution that isolates site-invariant information while maintaining relevant features of the connectome. We construct a latent space that is uncorrelated with the imaging site and highly correlated with patient age and a connectome summary measure; here, we focus on network modularity. The proposed model is a conditional variational autoencoder with three additional prediction tasks: one for patient age, and two for modularity, each trained exclusively on data from one site. This model enables us to 1) isolate site-invariant biological features, 2) learn site context, and 3) re-inject site context and project biological features to desired site domains. We tested these hypotheses by projecting 77 connectomes from two studies and protocols, the Vanderbilt Memory and Aging Project (VMAP) and Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD), to a common site. We find that the resulting dataset of modularity has statistically similar means (p-value < 0.05) across sites. In addition, we fit a linear model to the joint dataset and find that positive correlations between age and modularity were preserved.
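As a rough illustration of the architecture described above, the following is a minimal, untrained NumPy sketch of a conditional variational autoencoder with one age head and two site-specific modularity heads. It is not the authors' implementation: the layer sizes, the single linear layer per component, the unweighted sum of loss terms, and all function names are assumptions made for illustration, and the toy inputs are random numbers rather than real connectomes. The abstract does not say how the latent space is made uncorrelated with site, so that term is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 84 * 83 // 2  # assumption: upper triangle of an 84-node connectome
N_LATENT = 16              # assumption: latent dimensionality
N_SITES = 2                # two studies in the abstract (VMAP, BIOCARD)


def init_linear(n_in, n_out):
    """Return (weights, bias) for one dense layer with small random weights."""
    return rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out)


params = {
    "enc_mu": init_linear(N_FEATURES, N_LATENT),
    "enc_logvar": init_linear(N_FEATURES, N_LATENT),
    "dec": init_linear(N_LATENT + N_SITES, N_FEATURES),  # decoder sees z and site
    "age": init_linear(N_LATENT, 1),
    "mod": [init_linear(N_LATENT, 1) for _ in range(N_SITES)],  # one head per site
}


def dense(x, layer):
    weights, bias = layer
    return x @ weights + bias


def encode(x):
    """Map connectomes to a latent sample z and the Gaussian parameters."""
    mu, logvar = dense(x, params["enc_mu"]), dense(x, params["enc_logvar"])
    z = mu + rng.standard_normal(mu.shape) * np.exp(0.5 * logvar)
    return z, mu, logvar


def decode(z, site):
    """Re-inject site context by concatenating a one-hot site code to z."""
    site_code = np.eye(N_SITES)[site]
    return dense(np.concatenate([z, site_code], axis=1), params["dec"])


def loss(x, site, age, modularity):
    """Reconstruction + KL + age error + per-site modularity error."""
    z, mu, logvar = encode(x)
    recon = np.mean((decode(z, site) - x) ** 2)
    kl = -0.5 * np.mean(np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))
    age_err = np.mean((dense(z, params["age"])[:, 0] - age) ** 2)
    mod_err = 0.0
    for s in range(N_SITES):
        mask = site == s  # each modularity head only sees its own site's data
        if mask.any():
            pred = dense(z[mask], params["mod"][s])[:, 0]
            mod_err += np.mean((pred - modularity[mask]) ** 2)
    return recon + kl + age_err + mod_err


def project_to_site(x, target_site):
    """Encode to the posterior mean, then decode every subject as target_site."""
    mu = dense(x, params["enc_mu"])
    return decode(mu, np.full(len(x), target_site))


# Synthetic toy inputs (random numbers, NOT real data), only to exercise the code.
x = rng.random((6, N_FEATURES))
site = np.array([0, 0, 0, 1, 1, 1])
age = rng.uniform(60, 90, size=6)
modularity = rng.uniform(0.2, 0.6, size=6)

total_loss = loss(x, site, age, modularity)
harmonized = project_to_site(x, target_site=0)
```

In this sketch, harmonization amounts to calling `project_to_site` with the same target site for every subject, so that all reconstructed connectomes share one site code. A trained model would additionally need an optimizer and some penalty that discourages site information in the latent space.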