Zhang Zikai, Liu Ping, Xu Jiahao, Hu Rui
IEEE Trans Neural Netw Learn Syst. 2025 Jul 8;PP. doi: 10.1109/TNNLS.2025.3580495.
Federated learning (FL) has recently been used to collaboratively fine-tune foundation models (FMs) across multiple clients. Notably, federated low-rank adaptation (LoRA)-based fine-tuning methods have gained attention, as they allow clients to fine-tune FMs locally with only a small fraction of trainable parameters. However, most existing methods either do not account for clients' heterogeneous resources or lack an effective local training strategy to maximize global fine-tuning performance under limited resources. In this work, we propose Fed-HeLLo, a novel federated LoRA-based fine-tuning framework with heterogeneous LoRA allocation that enables clients to collaboratively fine-tune an FM while training different local LoRA layers. To ensure its effectiveness, we develop several heterogeneous LoRA allocation (HLA) strategies that adaptively allocate local trainable LoRA layers based on clients' resource capabilities and layer importance. Specifically, to capture dynamic layer importance, we design a Fisher information matrix score-based HLA (FIM-HLA) strategy that leverages dynamic gradient-norm information. To better stabilize training, we account for the intrinsic importance of LoRA layers and design a geometrically defined HLA (GD-HLA) strategy, which shapes the collective distribution of trainable LoRA layers into specific geometric patterns such as triangle, inverted triangle, bottleneck, and uniform. Moreover, we extend GD-HLA into a randomized version, randomized GD-HLA (RGD-HLA), which introduces randomness to further improve model accuracy. By codesigning these strategies, we incorporate both dynamic and intrinsic layer importance into our HLA design. To thoroughly evaluate our approach, we simulate a variety of federated LoRA-based fine-tuning settings using five datasets and three levels of data distribution, ranging from independent and identically distributed (i.i.d.) to extreme non-i.i.d. The experimental results demonstrate the effectiveness and efficiency of Fed-HeLLo with the proposed HLA strategies. The code is available at https://github.com/TNI-playground/Fed_HeLLo.
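To make the allocation strategies concrete, below is a minimal sketch of how the GD-HLA/RGD-HLA patterns and an FIM-style scoring could be realized. The function names, the per-pattern weightings, and the LoRA parameter-naming convention are illustrative assumptions under a generic PyTorch setup, not the paper's implementation; see the linked repository for the authors' code.

    # Illustrative sketch only: weightings and naming conventions are assumptions.
    import numpy as np

    def gd_hla_mask(num_layers, budget, pattern="triangle", rng=None):
        """Select which LoRA layers a client trains under a resource budget.

        Deterministic GD-HLA keeps the top-`budget` layers under a geometric
        prior; passing an `rng` gives the RGD-HLA variant, which samples
        layers with probability proportional to the same prior.
        """
        layers = np.arange(num_layers)
        if pattern == "triangle":             # weight deeper layers more
            weights = (layers + 1).astype(float)
        elif pattern == "inverted_triangle":  # weight shallower layers more
            weights = (num_layers - layers).astype(float)
        elif pattern == "bottleneck":         # favor both ends over the middle
            weights = np.abs(layers - (num_layers - 1) / 2) + 1.0
        elif pattern == "uniform":            # no positional preference
            weights = np.ones(num_layers)
        else:
            raise ValueError(f"unknown pattern: {pattern}")

        if rng is None:                       # GD-HLA: top-k layers by weight
            chosen = np.argsort(-weights)[:budget]
        else:                                 # RGD-HLA: weighted sampling without replacement
            chosen = rng.choice(num_layers, size=budget, replace=False,
                                p=weights / weights.sum())

        mask = np.zeros(num_layers, dtype=bool)
        mask[chosen] = True
        return mask

    def fim_hla_mask(model, loss, num_layers, budget):
        """Score layers by a diagonal-FIM proxy (squared gradient norm of
        their LoRA parameters) and keep the top-`budget` layers trainable.

        `model` is a torch.nn.Module; this assumes LoRA parameter names
        contain 'layers.{i}.' and 'lora', which depends on the wrapper used.
        """
        loss.backward()                       # populate .grad on LoRA parameters
        scores = np.zeros(num_layers)
        for name, p in model.named_parameters():
            if "lora" in name and p.grad is not None:
                for i in range(num_layers):
                    if f"layers.{i}." in name:
                        scores[i] += p.grad.pow(2).sum().item()
                        break
        chosen = np.argsort(-scores)[:budget]
        mask = np.zeros(num_layers, dtype=bool)
        mask[chosen] = True
        return mask

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        print(gd_hla_mask(num_layers=12, budget=4, pattern="triangle"))           # GD-HLA
        print(gd_hla_mask(num_layers=12, budget=4, pattern="triangle", rng=rng))  # RGD-HLA

In this sketch, a smaller `budget` models a more resource-constrained client, and the boolean mask would gate which LoRA layers receive gradient updates during local training.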