Centre for Sleep and Cognition & Centre for Translational MR Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Department of Electrical and Computer Engineering, National University of Singapore, Singapore; Department of Medicine, Healthy Longevity Translational Research Programme, Human Potential Translational Research Programme & Institute for Digital Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; N.1 Institute for Health, National University of Singapore, Singapore.
Centre for Sleep and Cognition & Centre for Translational MR Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.
Med Image Anal. 2025 Jan;99:103354. doi: 10.1016/j.media.2024.103354. Epub 2024 Sep 21.
Pooling MRI data from multiple datasets requires harmonization to reduce undesired inter-site variability while preserving the effects of biological variables (or covariates). The popular harmonization approach ComBat uses a mixed-effects regression framework that explicitly accounts for covariate distribution differences across datasets. There is also significant interest in developing harmonization approaches based on deep neural networks (DNNs), such as the conditional variational autoencoder (cVAE). However, current DNN approaches do not explicitly account for covariate distribution differences across datasets. Here, we provide mathematical results suggesting that not accounting for covariates can lead to suboptimal harmonization. We propose two DNN-based covariate-aware harmonization approaches: covariate VAE (coVAE) and DeepResBat. The coVAE approach naturally extends cVAE by concatenating covariates and site information with site- and covariate-invariant latent representations. DeepResBat adopts a residual framework inspired by ComBat. DeepResBat first removes the effects of covariates with nonlinear regression trees, then eliminates site differences with cVAE. Finally, covariate effects are added back to the harmonized residuals. Using three datasets from three continents with a total of 2787 participants and 10,085 anatomical T1 scans, we find that DeepResBat and coVAE outperformed ComBat, CovBat and cVAE in terms of removing dataset differences, while enhancing biological effects of interest. However, coVAE hallucinates spurious associations between anatomical MRI and covariates even when no association exists. Future studies proposing DNN-based harmonization approaches should be aware of this false-positive pitfall. Overall, our results suggest that DeepResBat is an effective deep learning alternative to ComBat.
Code for DeepResBat can be found here: https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/harmonization/An2024_DeepResBat.
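The three-step residual framework described above (remove covariate effects, harmonize the residuals across sites, add covariate effects back) can be illustrated with a minimal sketch. This is not the DeepResBat implementation: ordinary least squares stands in for the nonlinear regression trees, per-site mean removal stands in for the cVAE, and all function names (`fit_covariate_effect`, `harmonize_residuals`, `residual_harmonize`) as well as the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_covariate_effect(x, covariates):
    """Estimate covariate effects on the features.

    ASSUMPTION: ordinary least squares is a simple stand-in for the
    nonlinear regression trees mentioned in the abstract.
    """
    design = np.column_stack([np.ones(len(x)), covariates])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)

    def predict(c):
        return np.column_stack([np.ones(len(c)), c]) @ coef

    return predict


def harmonize_residuals(residuals, site):
    """Remove site differences from the residuals.

    ASSUMPTION: subtracting each site's mean residual is a crude
    placeholder for the cVAE used in the paper.
    """
    out = residuals.copy()
    for s in np.unique(site):
        mask = site == s
        out[mask] -= residuals[mask].mean(axis=0)
    return out


def residual_harmonize(x, covariates, site):
    """Residual framework: remove covariates, harmonize, add covariates back."""
    effect = fit_covariate_effect(x, covariates)
    covariate_part = effect(covariates)           # step 1: covariate effects
    residuals = x - covariate_part                # residuals still carry site effects
    harmonized = harmonize_residuals(residuals, site)  # step 2: site removal
    return harmonized + covariate_part            # step 3: add covariates back


# Synthetic example (hypothetical data): one feature, one covariate, two sites.
rng = np.random.default_rng(0)
n = 200
site = np.repeat([0, 1], n // 2)
age = rng.uniform(20, 80, size=n)
x = (0.5 * age + 3.0 * site + rng.normal(0, 0.1, n))[:, None]
x_harmonized = residual_harmonize(x, age[:, None], site)
```

Because the covariate effect is set aside before site differences are removed and restored afterwards, the site-removal step in this sketch only ever operates on the residuals, which is the structural idea the abstract attributes to ComBat and DeepResBat.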