An Lijun, Zhang Chen, Wulan Naren, Zhang Shaoshi, Chen Pansheng, Ji Fang, Ng Kwun Kei, Chen Christopher, Zhou Juan Helen, Yeo B T Thomas
Centre for Sleep and Cognition & Centre for Translational MR Research, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.
Department of Electrical and Computer Engineering, National University of Singapore, Singapore.
bioRxiv. 2024 Aug 6:2024.01.18.574145. doi: 10.1101/2024.01.18.574145.
Pooling MRI data from multiple datasets requires harmonization to reduce undesired inter-site variabilities, while preserving effects of biological variables (or covariates). The popular harmonization approach ComBat uses a mixed-effects regression framework that explicitly accounts for covariate distribution differences across datasets. There is also significant interest in developing harmonization approaches based on deep neural networks (DNNs), such as the conditional variational autoencoder (cVAE). However, current DNN approaches do not explicitly account for covariate distribution differences across datasets. Here, we provide mathematical results suggesting that not accounting for covariates can lead to suboptimal harmonization. We propose two DNN-based covariate-aware harmonization approaches: covariate VAE (coVAE) and DeepResBat. The coVAE approach naturally extends cVAE by concatenating covariates and site information with site- and covariate-invariant latent representations. DeepResBat adopts a residual framework inspired by ComBat. DeepResBat first removes covariate effects with nonlinear regression trees and then eliminates site differences from the residuals with cVAE. Finally, covariate effects are added back to the harmonized residuals. Using three datasets from three continents with a total of 2787 participants and 10085 anatomical T1 scans, we find that DeepResBat and coVAE outperform ComBat, CovBat and cVAE in terms of removing dataset differences, while enhancing biological effects of interest. However, coVAE hallucinates spurious associations between anatomical MRI and covariates even when no association exists. Future studies proposing DNN-based harmonization approaches should be aware of this false-positive pitfall. Overall, our results suggest that DeepResBat is an effective deep learning alternative to ComBat.
Code for DeepResBat can be found here: https://github.com/ThomasYeoLab/CBIG/tree/master/stable_projects/harmonization/An2024_DeepResBat.
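The three-step residual framework described in the abstract (remove covariate effects, harmonize the residuals, add covariate effects back) can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the repository linked above. The function names `estimate_covariate_effect`, `harmonize_residual` and `residual_harmonize` are hypothetical, a polynomial least-squares fit stands in for the nonlinear regression trees, per-site mean alignment stands in for the cVAE, and the data are synthetic.

```python
import numpy as np


def estimate_covariate_effect(covariates, site, feature, degree=2):
    """Estimate the covariate contribution to a single MRI feature.

    Assumption: polynomial least squares replaces the regression trees named
    in the abstract. Site indicators are fitted jointly so that site offsets
    are not absorbed into the covariate effect.
    """
    feature = np.asarray(feature, dtype=float)
    covariates = np.asarray(covariates, dtype=float).reshape(len(feature), -1)
    # Polynomial terms of each covariate: x, x^2, ..., x^degree (centered).
    poly = np.hstack([covariates ** d for d in range(1, degree + 1)])
    poly_centered = poly - poly.mean(axis=0)
    # One indicator column per site; together they act as the intercept.
    site = np.asarray(site)
    dummies = (site[:, None] == np.unique(site)[None, :]).astype(float)
    design = np.hstack([poly_centered, dummies])
    coef, *_ = np.linalg.lstsq(design, feature, rcond=None)
    # Keep only the part of the fit explained by the covariates.
    return poly_centered @ coef[: poly.shape[1]]


def harmonize_residual(residual, site):
    """Assumption: a simple stand-in for the cVAE step.

    Shifts each site's residual mean to the grand mean.
    """
    residual = np.asarray(residual, dtype=float)
    site = np.asarray(site)
    out = residual.copy()
    grand_mean = residual.mean()
    for s in np.unique(site):
        mask = site == s
        out[mask] += grand_mean - residual[mask].mean()
    return out


def residual_harmonize(covariates, site, feature):
    """Residual pipeline: remove covariates, harmonize, add covariates back."""
    feature = np.asarray(feature, dtype=float)
    effect = estimate_covariate_effect(covariates, site, feature)
    residual = feature - effect                    # step 1: remove covariate effects
    harmonized_residual = harmonize_residual(residual, site)  # step 2: remove site effects
    return harmonized_residual + effect            # step 3: add covariate effects back


# Synthetic example: two sites with different age distributions and a site offset.
rng = np.random.default_rng(0)
n = 500
site = np.repeat([0, 1], n)
age = np.concatenate([rng.normal(60, 5, n), rng.normal(75, 5, n)])
true_signal = -0.05 * age                          # biological effect of age
site_offset = np.where(site == 1, 2.0, 0.0)        # unwanted site effect
feature = true_signal + site_offset + rng.normal(0, 0.1, 2 * n)

harmonized = residual_harmonize(age, site, feature)


def site_gap(values):
    """Absolute difference between the two site means."""
    return abs(values[site == 1].mean() - values[site == 0].mean())


print("site gap before:", site_gap(feature - true_signal))
print("site gap after: ", site_gap(harmonized - true_signal))
```

Because age differs between the two synthetic sites, aligning site means on the raw feature would also remove part of the genuine age effect. Estimating and setting aside the covariate effect first is what lets the site step act only on the residuals.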