A Dataset Auditing Method for Collaboratively Trained Machine Learning Models.

Publication information

IEEE Trans Med Imaging. 2023 Jul;42(7):2081-2090. doi: 10.1109/TMI.2022.3220706. Epub 2023 Jun 30.

DOI:10.1109/TMI.2022.3220706

Abstract

Dataset auditing for machine learning (ML) models is a method to evaluate whether a given dataset was used to train a model. In a federated learning setting, where multiple institutions collaboratively train a model on their decentralized private datasets, dataset auditing can facilitate the enforcement of regulations that provide rules for preserving privacy and also allow users to revoke authorizations and remove their data from collaboratively trained models. This paper first proposes a set of requirements for a practical dataset auditing method, and then presents a novel dataset auditing method called Ensembled Membership Auditing (EMA). Its key idea is to leverage previously proposed Membership Inference Attack methods and to aggregate data-wise membership scores using statistical testing to audit a dataset for an ML model. We have experimentally evaluated the proposed approach with benchmark datasets, as well as 4 X-ray datasets (CBIS-DDSM, COVIDx, Child-XRay, and CXR-NIH) and 3 dermatology datasets (DERM7pt, HAM10000, and PAD-UFES-20). Our results show that EMA meets the requirements substantially better than the previous state-of-the-art method. Our code is at: https://github.com/Hazelsuko07/EMA.
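The idea of aggregating per-sample membership scores with a statistical test can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that each attack yields one score per sample, with higher scores suggesting membership, that a calibration set of known non-members is available, and that a one-sided permutation test stands in for whatever test the paper uses. The names `ensemble_scores` and `audit_p_value`, and the toy scores, are hypothetical.

```python
import random


def ensemble_scores(per_attack_scores):
    """Average the per-sample scores produced by several attacks (assumed ensemble rule)."""
    return [sum(sample) / len(sample) for sample in zip(*per_attack_scores)]


def audit_p_value(query, calibration, n_perm=2000, seed=0):
    """One-sided permutation test on the difference of mean scores.

    A small p-value suggests the query scores are higher than the
    calibration (non-member) scores, i.e. the query set looks like training data.
    """
    rng = random.Random(seed)
    k = len(query)
    observed = sum(query) / k - sum(calibration) / len(calibration)
    pooled = list(query) + list(calibration)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy, hand-made scores (hypothetical): non-members score low, the query set scores high.
calibration = [0.30 + 0.01 * i for i in range(20)]
attack_a = [0.80 + 0.01 * i for i in range(20)]
attack_b = [0.70 + 0.01 * i for i in range(20)]

query = ensemble_scores([attack_a, attack_b])
p_member = audit_p_value(query, calibration)        # query clearly above calibration
p_same = audit_p_value(calibration, calibration)    # no difference by construction

print(f"p-value, query vs. calibration: {p_member:.4f}")
print(f"p-value, calibration vs. itself: {p_same:.4f}")
```

In this sketch, an auditor would flag the query dataset as likely used in training when the p-value falls below a chosen significance level such as 0.05. The threshold and the choice of test are design decisions that the abstract does not specify.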

Similar articles

Decentralised, collaborative, and privacy-preserving machine learning for multi-hospital data.

EBioMedicine. 2024 Mar;101:105006. doi: 10.1016/j.ebiom.2024.105006. Epub 2024 Feb 19.

Generalized genomic data sharing for differentially private federated learning.

J Biomed Inform. 2022 Aug;132:104113. doi: 10.1016/j.jbi.2022.104113. Epub 2022 Jun 9.

Privacy-preserving Model Training for Disease Prediction Using Federated Learning with Differential Privacy.

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:1358-1361. doi: 10.1109/EMBC48229.2022.9871742.

COVID-19 detection using federated machine learning.

PLoS One. 2021 Jun 8;16(6):e0252573. doi: 10.1371/journal.pone.0252573. eCollection 2021.

Federated Learning on Clinical Benchmark Data: Performance Assessment.

J Med Internet Res. 2020 Oct 26;22(10):e20891. doi: 10.2196/20891.

PrivaTree: Collaborative Privacy-Preserving Training of Decision Trees on Biomedical Data.

IEEE/ACM Trans Comput Biol Bioinform. 2024 Jan-Feb;21(1):1-13. doi: 10.1109/TCBB.2023.3286274. Epub 2024 Feb 5.

Multi-site fMRI analysis using privacy-preserving federated learning and domain adaptation: ABIDE results.

Med Image Anal. 2020 Oct;65:101765. doi: 10.1016/j.media.2020.101765. Epub 2020 Jul 2.

A unified method to revoke the private data of patients in intelligent healthcare with audit to forget.

Nat Commun. 2023 Oct 6;14(1):6255. doi: 10.1038/s41467-023-41703-x.

MemberShield: A framework for federated learning with membership privacy.

Neural Netw. 2025 Jan;181:106768. doi: 10.1016/j.neunet.2024.106768. Epub 2024 Oct 1.

Cited by

Federated learning for medical image analysis: A survey.

Pattern Recognit. 2024 Jul;151. doi: 10.1016/j.patcog.2024.110424. Epub 2024 Mar 12.

SkinLesNet: Classification of Skin Lesions and Detection of Melanoma Cancer Using a Novel Multi-Layer Deep Convolutional Neural Network.

Cancers (Basel). 2023 Dec 24;16(1):108. doi: 10.3390/cancers16010108.
