Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model

Peng, Zelin; Xu, Zhengqin; Zeng, Zhilin; Xie, Lingxi; Tian, Qi; Shen, Wei

Title: Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model

Authors: Zelin Peng, Zhengqin Xu, Zhilin Zeng, Lingxi Xie, Qi Tian, Wei Shen

(Submitted on 28 Nov 2023 (v1), last revised 28 Mar 2024 (this version, v2))

Abstract: Parameter-efficient fine-tuning (PEFT) is an effective way to unlock the potential of large foundation models in novel scenarios with limited training data. In the computer vision community, PEFT has proven effective for image classification, but little research has studied its ability for image segmentation. Fine-tuning segmentation models usually requires heavier parameter adjustment to align the projection directions in the parameter space with new scenarios. This poses a challenge for existing PEFT algorithms, which often inject a limited number of individual parameters into each block; this prevents substantial adjustment of the projection directions of the parameter space because of the limitation of the hidden Markov chain along blocks. In this paper, we equip PEFT with a cross-block orchestration mechanism to enable the adaptation of the Segment Anything Model (SAM) to various downstream scenarios. We introduce a novel inter-block communication module, which integrates a learnable relation matrix to facilitate communication among the different coefficient sets of each PEFT block's parameter space. Moreover, we propose an intra-block enhancement module, which introduces a linear projection head whose weights are generated from a hyper-complex layer, further enhancing the impact of the adjusted projection directions on the entire parameter space. Extensive experiments on diverse benchmarks demonstrate that the proposed approach consistently and significantly improves segmentation performance in novel scenarios with only around 1K additional parameters.
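The two modules named in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation, and the abstract does not give the exact formulation. The sketch assumes that (a) inter-block communication amounts to mixing each block's coefficient vector with those of the other blocks through a relation matrix, and (b) the hyper-complex layer builds a projection weight as a sum of Kronecker products. The names `orchestrate`, `hypercomplex_weight`, and `kron` are hypothetical helpers introduced only for this illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def mat_add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def orchestrate(coeff_sets, relation):
    """Assumed form of inter-block communication.

    coeff_sets[j] is the coefficient vector of PEFT block j and
    relation[i][j] weights how much block j contributes to block i,
    so new_i = sum_j relation[i][j] * coeff_sets[j].
    In training, `relation` would be a learnable parameter.
    """
    n_blocks = len(coeff_sets)
    dim = len(coeff_sets[0])
    return [
        [
            sum(relation[i][j] * coeff_sets[j][d] for j in range(n_blocks))
            for d in range(dim)
        ]
        for i in range(n_blocks)
    ]


def hypercomplex_weight(rule_mats, factor_mats):
    """Assumed form of the hyper-complex weight generator.

    Builds W = sum_i kron(rule_mats[i], factor_mats[i]); the small
    factors are what would be learned, and W would serve as the weight
    of the linear projection head.
    """
    total = None
    for rule, factor in zip(rule_mats, factor_mats):
        term = kron(rule, factor)
        total = term if total is None else mat_add(total, term)
    return total


# Two blocks with 2-dimensional coefficient sets; block 0 averages both
# blocks, block 1 keeps its own coefficients.
mixed = orchestrate([[1.0, 0.0], [0.0, 2.0]], [[0.5, 0.5], [0.0, 1.0]])

# A 2x4 projection weight generated from a 2x2 rule matrix and a 1x2 factor.
weight = hypercomplex_weight([[[1, 0], [0, 1]]], [[[2, 3]]])
```

With an identity relation matrix, `orchestrate` leaves every block's coefficients unchanged, which corresponds to ordinary per-block PEFT without cross-block communication.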

Comments:	Accepted by CVPR 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2311.17112 [cs.CV]
	(or arXiv:2311.17112v2 [cs.CV] for this version)

Submission history

From: Zelin Peng
[v1] Tue, 28 Nov 2023 11:23:34 GMT (1223kb,D)
[v2] Thu, 28 Mar 2024 16:51:18 GMT (1209kb,D)
