HarmonyTM: multi-center data harmonization applied to distributed learning for Parkinson's disease classification.

Author information

Souza Raissa, Stanley Emma A M, Gulve Vedant, Moore Jasmine, Kang Chris, Camicioli Richard, Monchi Oury, Ismail Zahinoor, Wilms Matthias, Forkert Nils D

Affiliations

University of Calgary, Department of Radiology, Cumming School of Medicine, Calgary, Alberta, Canada.

University of Calgary, Hotchkiss Brain Institute, Calgary, Alberta, Canada.

Publication information

J Med Imaging (Bellingham). 2024 Sep;11(5):054502. doi: 10.1117/1.JMI.11.5.054502. Epub 2024 Sep 20.

DOI:10.1117/1.JMI.11.5.054502

PMID:39308760

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11413651/

Abstract

PURPOSE

Distributed learning is widely used to comply with data-sharing regulations and access diverse datasets for training machine learning (ML) models. The traveling model (TM) is a distributed learning approach that sequentially trains with data from one center at a time, which is especially advantageous when dealing with limited local datasets. However, a critical concern emerges when centers utilize different scanners for data acquisition, which could potentially lead models to exploit these differences as shortcuts. Although data harmonization can mitigate this issue, current methods typically rely on large or paired datasets, which can be impractical to obtain in distributed setups.
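
The sequential training pattern described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the logistic-regression model, the toy data, and the names `sgd_epoch`, `train_traveling_model`, and `make_center` are all assumptions made for this example. The point it shows is that a single set of weights visits one center at a time, so each center only needs its own small local dataset.

```python
import math
import random

def sgd_epoch(weights, samples, lr=0.1):
    """One pass of logistic-regression SGD over a single center's local data."""
    for features, label in samples:
        z = sum(w * x for w, x in zip(weights, features))
        pred = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the log loss with respect to each weight is (pred - label) * x.
        weights = [w - lr * (pred - label) * x for w, x in zip(weights, features)]
    return weights

def train_traveling_model(centers, n_features, cycles=20, lr=0.1):
    """Visit the centers one at a time; only the weights leave each center."""
    weights = [0.0] * n_features
    for _ in range(cycles):
        for local_data in centers:
            weights = sgd_epoch(weights, local_data, lr)
    return weights

def make_center(n, rng):
    """Hypothetical toy center: the label is 1 when the first feature is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append(([x, 1.0], 1 if x > 0 else 0))  # second feature is a bias term
    return data

rng = random.Random(0)
centers = [make_center(3, rng) for _ in range(10)]  # many centers, few samples each
weights = train_traveling_model(centers, n_features=2)
```

In this sketch no raw samples are pooled; the loop over `centers` stands in for physically sending the model from one site to the next.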

APPROACH

We introduced HarmonyTM, a data harmonization method tailored for the TM. HarmonyTM effectively mitigates bias in the model's feature representation while retaining crucial disease-related information, all without requiring extensive datasets. Specifically, we employed adversarial training to "unlearn" bias from the features used in the model for classifying Parkinson's disease (PD). We evaluated HarmonyTM using multi-center three-dimensional (3D) neuroimaging datasets from 83 centers using 23 different scanners.
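
The abstract does not specify the loss functions used for the adversarial "unlearning" step. One common formulation in the adversarial harmonization literature, assumed here purely for illustration, trains a scanner head with ordinary cross-entropy and then updates the feature encoder with a "confusion" loss that pushes the scanner head's output toward a uniform distribution. The function names below are hypothetical, and the logits are made-up values.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scanner logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def scanner_loss(logits, scanner_id):
    """Cross-entropy used to train the scanner head (the adversary)."""
    return -math.log(softmax(logits)[scanner_id])

def confusion_loss(logits):
    """Cross-entropy against a uniform target, used to update the encoder.

    It is smallest when the scanner head cannot tell scanners apart.
    """
    probs = softmax(logits)
    return -sum(math.log(p) for p in probs) / len(probs)

N_SCANNERS = 23  # number of scanners reported in the abstract

confident = [5.0] + [0.0] * (N_SCANNERS - 1)  # scanner head is sure of scanner 0
uniform = [0.0] * N_SCANNERS                  # scanner head is fully confused

loss_confident = confusion_loss(confident)
loss_uniform = confusion_loss(uniform)
```

Under this assumption, the encoder is rewarded for features that leave the scanner head uninformed (`loss_uniform` is the minimum, equal to the log of the number of scanners), while a separate disease loss keeps the PD-related signal in the same features.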

RESULTS

Our results show that HarmonyTM improved PD classification accuracy from 72% to 76% and reduced (unwanted) scanner classification accuracy from 53% to 30% in the TM setup.

CONCLUSION

HarmonyTM is a method tailored for harmonizing 3D neuroimaging data within the TM approach, aiming to minimize shortcut learning in distributed setups. This prevents the disease classifier from leveraging scanner-specific details to classify patients with or without PD, a key aspect for deploying ML models for clinical applications.
