Tassi Emma, Bianchi Anna Maria, Calesella Federico, Vai Benedetta, Bellani Marcella, Nenadić Igor, Piras Fabrizio, Benedetti Francesco, Brambilla Paolo, Maggioni Eleonora
Department of Neurosciences and Mental Health, Fondazione IRCCS Cà Granda Ospedale Policlinico, Milano, Italy.
Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy.
Hum Brain Mapp. 2024 Dec 15;45(18):e70085. doi: 10.1002/hbm.70085.
Data aggregation across multiple research centers is gaining importance in MRI research: pooling diverse high-dimensional datasets into large-scale heterogeneous samples increases statistical power and the relevance of machine learning and deep learning algorithms. However, site-related effects have been shown to bias MRI features and confound subsequent analyses. Although the Combating Batch (ComBat) technique has recently been reported to successfully harmonize multi-scale neuroimaging features, assessments of its performance are still limited and largely based on qualitative visualizations and statistical analyses. In this study, we use a robust cross-validation approach to assess ComBat performance on volume- and surface-based measures acquired across three sites. A machine learning approach based on a multi-class Gaussian Process Classifier was applied to predict imaging site from raw and harmonized brain features, providing quantitative insight into ComBat effectiveness and verifying the association between biological covariates and harmonized brain features. Our findings showed that ComBat performance differed across measures of regional brain morphology, demonstrating tissue-specific site effect modeling. ComBat adjustment of site effects also varied at the regional level within each volume-based and surface-based measure. ComBat effectively eliminated unwanted site-related variability while maintaining or even enhancing the association of the data with biological factors. Of note, ComBat demonstrated flexibility and robustness when applied to unseen, independent gray matter volume data from the same sites.
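The evaluation idea described in the abstract (predict imaging site from features before and after harmonization, inside cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes synthetic data, uses a simplified location-and-scale site adjustment without ComBat's empirical Bayes shrinkage, and relies on scikit-learn's `GaussianProcessClassifier`. The helper names `fit_site_model`, `apply_site_model`, and `site_accuracy` are hypothetical. Site parameters are estimated on the training fold only and then applied to the held-out fold, so no information leaks from test data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler


def fit_site_model(X, site, covars):
    """Estimate per-site location and scale for each feature while
    adjusting for covariates. Simplified stand-in for ComBat:
    plain least squares, no empirical Bayes shrinkage."""
    sites = np.unique(site)
    dummies = (site[:, None] == sites[None, :]).astype(float)
    design = np.hstack([dummies, covars])      # site intercepts + covariates
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    gamma, beta = coef[:len(sites)], coef[len(sites):]
    resid = X - design @ coef
    return {
        "sites": sites,
        "gamma": gamma,                        # per-site intercepts
        "beta": beta,                          # covariate effects to preserve
        "grand": gamma.mean(axis=0),           # common intercept after adjustment
        "pooled_sd": resid.std(axis=0),
        "site_sd": np.vstack([resid[site == s].std(axis=0) for s in sites]),
    }


def apply_site_model(model, X, site, covars):
    """Remove site location/scale effects, keeping covariate effects."""
    idx = np.searchsorted(model["sites"], site)
    bio = covars @ model["beta"]
    resid = (X - model["gamma"][idx] - bio) / model["site_sd"][idx]
    return model["grand"] + bio + resid * model["pooled_sd"]


def site_accuracy(X, site, covars, harmonize, n_splits=5, seed=0):
    """Cross-validated accuracy of predicting site from the features."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    correct = 0
    for tr, te in cv.split(X, site):
        X_tr, X_te = X[tr], X[te]
        if harmonize:
            # Parameters come from the training fold only.
            model = fit_site_model(X_tr, site[tr], covars[tr])
            X_tr = apply_site_model(model, X_tr, site[tr], covars[tr])
            X_te = apply_site_model(model, X_te, site[te], covars[te])
        scaler = StandardScaler().fit(X_tr)
        clf = GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=seed)
        clf.fit(scaler.transform(X_tr), site[tr])
        correct += int((clf.predict(scaler.transform(X_te)) == site[te]).sum())
    return correct / len(site)


# Synthetic example: 3 sites, 8 features, one covariate (standardized age).
rng = np.random.default_rng(0)
n_per_site, n_feat = 50, 8
site = np.repeat(np.arange(3), n_per_site)
age = rng.uniform(20, 60, size=site.size)
covars = ((age - age.mean()) / age.std())[:, None]
shift = np.array([-1.5, 0.0, 1.5])[site][:, None]   # additive site effect
scale = np.array([0.8, 1.0, 1.3])[site][:, None]    # multiplicative site effect
X = 0.5 * covars + shift + scale * rng.normal(size=(site.size, n_feat))

acc_raw = site_accuracy(X, site, covars, harmonize=False)
acc_harm = site_accuracy(X, site, covars, harmonize=True)
print(f"site accuracy: raw={acc_raw:.2f}, harmonized={acc_harm:.2f}, chance=0.33")
```

Under these assumptions, a site-prediction accuracy that drops toward chance after harmonization suggests that the site signal has been removed. Checking that covariate associations survive would need a separate regression of the harmonized features on the covariates.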