Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul, Korea.
Department of Nuclear Medicine, Seoul National University College of Medicine, Seoul, Korea.
Sci Rep. 2024 Sep 30;14(1):22729. doi: 10.1038/s41598-024-73863-1.
Enhancing deep learning performance requires extensive datasets. Centralized training raises concerns about data ownership and security. Additionally, large models are often unsuitable for hospitals because of their limited resource capacity. Federated learning (FL) has been introduced to address these issues. However, FL faces challenges such as vulnerability to attacks, non-IID data, reliance on a central server, high communication overhead, and suboptimal model aggregation. Furthermore, FL is not optimized for realistic hospital database environments, where data accumulate dynamically. To overcome these limitations, we propose federated influencer learning (FIL) as a secure and efficient collaborative learning paradigm. Unlike the server-client model of FL, FIL features an equal-status structure among participants, with an administrator overseeing the overall process. FIL comprises four stages: local training, qualification, screening, and influencing. Local training is similar to vanilla FL, except for the optional use of a shared dataset. In the qualification stage, participants are classified as influencers or followers. During the screening stage, the integrity of the influencer's logits is examined. If the integrity is confirmed, the influencer shares its knowledge with the other participants. FIL is more secure than FL because it eliminates the need for model-parameter transactions, central servers, and generative models. Additionally, FIL supports model-agnostic training. These features make FIL particularly promising for fields such as healthcare, where maintaining confidentiality is crucial. Our experiments demonstrated the effectiveness of FIL, which outperformed several FL methods on large medical (X-ray, MRI, and PET) and natural (CIFAR-10) image datasets in a dynamically accumulating database environment, with consistently higher precision, recall, and Dice score, and a lower standard deviation between participants.
In particular, in the PET dataset, FIL achieved about a 40% improvement in Dice score and recall.
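The qualification, screening, and influencing stages can be illustrated with a minimal Python sketch. The abstract does not state how influencers are chosen, how logit integrity is checked, or how knowledge is transferred, so every rule below is an assumption made for illustration: the influencer is taken to be the participant with the highest self-reported accuracy on the shared dataset, screening re-computes that accuracy from the submitted logits and rejects malformed or non-finite values, and influencing hands followers temperature-softened probabilities as distillation targets. All function and parameter names (`qualify`, `screen`, `soft_targets`, `fil_round`, `temperature`) are hypothetical and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def accuracy(logits_per_sample, labels):
    """Fraction of shared-dataset samples whose argmax logit matches the label."""
    correct = sum(
        1 for row, y in zip(logits_per_sample, labels) if row.index(max(row)) == y
    )
    return correct / len(labels)


def qualify(claimed_scores):
    """Qualification (assumed rule): the best self-reported score becomes influencer."""
    influencer = max(claimed_scores, key=claimed_scores.get)
    followers = [p for p in claimed_scores if p != influencer]
    return influencer, followers


def screen(logits_per_sample, labels, claimed_score, n_classes, tol=1e-9):
    """Screening (assumed checks): shape, finiteness, and a reproducible score."""
    if len(logits_per_sample) != len(labels):
        return False
    for row in logits_per_sample:
        if len(row) != n_classes or not all(math.isfinite(x) for x in row):
            return False
    return abs(accuracy(logits_per_sample, labels) - claimed_score) <= tol


def soft_targets(logits_per_sample, temperature=2.0):
    """Influencing (assumed form): softened probabilities used as distillation targets."""
    return [softmax([x / temperature for x in row]) for row in logits_per_sample]


def fil_round(claimed_scores, participant_logits, shared_labels, n_classes):
    """One administrator-side round; returns (influencer, followers, targets or None)."""
    influencer, followers = qualify(claimed_scores)
    logits = participant_logits[influencer]
    if not screen(logits, shared_labels, claimed_scores[influencer], n_classes):
        return influencer, followers, None  # integrity not confirmed: nothing is shared
    return influencer, followers, soft_targets(logits)
```

In this sketch only logits on the shared dataset are exchanged, never model parameters, which reflects the abstract's statement that FIL avoids model-parameter transactions and supports model-agnostic training. If screening fails, the round returns no targets and the followers keep their local models unchanged.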