Overcoming Site Variability in Multisite fMRI Studies: an Autoencoder Framework for Enhanced Generalizability of Machine Learning Models.

Author information

Almuqhim Fahad, Saeed Fahad

Affiliations

Knight Foundation School of Computing and Information Sciences (KFSCIS), Florida International University, Miami, FL, USA.

Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia.

Publication information

Neuroinformatics. 2025 Sep 2;23(3):46. doi: 10.1007/s12021-025-09746-1.

DOI:10.1007/s12021-025-09746-1

PMID:40892131

Abstract

Harmonizing multisite functional magnetic resonance imaging (fMRI) data is crucial for eliminating site-specific variability that hinders the generalizability of machine learning (ML) models. Traditional harmonization techniques, such as ComBat, depend on additive and multiplicative factors and may struggle to capture the non-linear interactions between scanner hardware, acquisition protocols, and signal variations across imaging sites. In addition, these statistical techniques require data from all sites when the harmonization model is fit, which can have the unintended consequence of data leakage for ML models trained on the harmonized data. Such models may show low reliability and reproducibility when tested on unseen datasets, limiting their applicability for general clinical use. In this study, we propose Autoencoders (AEs) as an alternative for harmonizing multisite fMRI data. Our framework leverages the non-linear representation learning capabilities of AEs to reduce site-specific effects while preserving biologically meaningful features. Our evaluation on the Autism Brain Imaging Data Exchange I (ABIDE-I) dataset, which contains 1,035 subjects collected from 17 centers, demonstrates statistically significant improvements in leave-one-site-out (LOSO) cross-validation. All AE variants (AE, SAE, TAE, and DAE) significantly outperformed the baseline model (p < 0.01), with mean accuracy improvements ranging from 3.41% to 5.04%. Our findings demonstrate the potential of AEs to harmonize multisite neuroimaging data effectively, enabling robust downstream analyses across various neuroscience applications while reducing data leakage and preserving neurobiological features. Our open-source code is available at https://github.com/pcdslab/Autoencoder-fMRI-Harmonization.
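The leave-one-site-out protocol and the data-leakage concern mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names (`loso_splits`, `fit_scaler`, `apply_scaler`) and the toy data are hypothetical, and a simple per-feature standardization stands in for whatever harmonization model (for example, an autoencoder) would be fit in practice. The point it shows is only that the transform's parameters are estimated from the training sites and then applied unchanged to the held-out site.

```python
from collections import defaultdict
from statistics import mean, pstdev


def loso_splits(sites):
    """Yield (held_out_site, train_idx, test_idx), one fold per site."""
    by_site = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        train_idx = [i for s in sorted(by_site) if s != held_out for i in by_site[s]]
        yield held_out, train_idx, test_idx


def fit_scaler(values):
    """Estimate mean and standard deviation from training values only."""
    sd = pstdev(values)
    return mean(values), (sd if sd > 0 else 1.0)


def apply_scaler(values, params):
    """Apply previously fitted parameters without re-estimating them."""
    mu, sd = params
    return [(v - mu) / sd for v in values]


# Toy data (hypothetical): one feature per subject, three sites.
sites = ["A", "A", "B", "B", "C", "C"]
feature = [1.0, 3.0, 10.0, 12.0, 5.0, 7.0]

folds = []
for held_out, train_idx, test_idx in loso_splits(sites):
    # Placeholder for fitting a harmonization model: training sites only.
    params = fit_scaler([feature[i] for i in train_idx])
    # The held-out site is transformed with those parameters, never refit.
    test_scaled = apply_scaler([feature[i] for i in test_idx], params)
    folds.append((held_out, train_idx, test_idx, test_scaled))
```

By contrast, a method that estimates its parameters from all sites at once would need the held-out site's data before the split, which is the leakage scenario the abstract describes.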
