Liang Xudong, Xie Jiang, Wei Jinzhu, Zhang Mengfei, Zhang Haoyang
School of Computer Engineering and Science, Shanghai University, Shanghai, China.
J Biomed Inform. 2025 Jun;166:104840. doi: 10.1016/j.jbi.2025.104840. Epub 2025 May 8.
The full fine-tuning paradigm becomes impractical when applying pre-trained models to downstream tasks due to significant computational and storage costs. Parameter-efficient fine-tuning (PEFT) methods can alleviate this issue. However, applying PEFT methods alone leads to sub-optimal performance owing to the domain gap between pre-trained models and medical downstream tasks.
This study proposes Knowledge-enhanced Parameter-efficient Transfer Learning with METER (KPL-METER) for medical vision-language (VL) downstream tasks. KPL-METER combines PEFT methods, including an innovative PEFT module for the multi-modal branches, with newly introduced external domain-specific knowledge to enhance model performance. First, a lightweight, plug-and-play module named Sharing Adapter (SAdapter) is developed and inserted into the multi-modal encoders. This allows the two modalities to maintain uni-modal features while encouraging cross-modal consistency. Second, a novel knowledge extraction method and a parameter-free knowledge modeling strategy are developed to incorporate domain-specific knowledge from the Unified Medical Language System (UMLS) into the multi-modal features. To further enhance the modeling of uni-modal features, an Adapter is added to the image and text encoders.
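The abstract does not give the SAdapter's internal structure, but PEFT adapters of this kind are typically bottleneck modules (down-projection, nonlinearity, up-projection) with a residual connection. The minimal NumPy sketch below is purely illustrative: the class name, dimensions, and weight-sharing scheme are assumptions, not the paper's implementation. The single shared weight set applied to both branches illustrates how sharing could encourage cross-modal consistency, while the residual path preserves each modality's own features.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

class SharedBottleneckAdapter:
    """Hypothetical bottleneck adapter shared between the image and text
    branches (an assumption about SAdapter, not the paper's actual design).
    The up-projection is zero-initialized, so at the start of fine-tuning
    the module is an identity map and the pre-trained features pass through
    unchanged."""

    def __init__(self, d_model=768, d_bottleneck=64, seed=0):
        rng = np.random.default_rng(seed)
        self.w_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
        self.w_up = np.zeros((d_bottleneck, d_model))  # identity at init

    def __call__(self, h):
        # residual connection keeps the original uni-modal features
        return h + gelu(h @ self.w_down) @ self.w_up

# The same adapter instance (shared weights) processes both modalities:
adapter = SharedBottleneckAdapter(d_model=8, d_bottleneck=2)
image_feats = np.ones((4, 8))   # e.g. 4 image tokens
text_feats = np.zeros((5, 8))   # e.g. 5 text tokens
img_out = adapter(image_feats)
txt_out = adapter(text_feats)
```

Because only the small `w_down`/`w_up` matrices are trained while the backbone stays frozen, the tunable parameter count is a fraction of full fine-tuning, which matches the paper's parameter-efficiency claim.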
The effectiveness of the proposed model is evaluated on two medical VL tasks using three VL datasets. The results indicate that the KPL-METER model outperforms other PEFT methods in terms of performance while utilizing fewer parameters. Furthermore, KPL-METER-MED, which incorporates medical-tailored encoders, is developed. Compared to previous models in the medical domain, KPL-METER-MED tunes fewer parameters while generally achieving higher performance.
The proposed KPL-METER architecture effectively adapts general VL models for medical VL tasks, and the designed knowledge extraction and fusion method notably enhances performance by integrating medical domain-specific knowledge. Code is available at https://github.com/Adam-lxd/KPL-METER.