Abstract: This paper presents a new framework for open-vocabulary semantic segmentation with the pre-trained vision-language model, named Side Adapter Network (SAN). Our approach models the semantic ...
Abstract: This article concentrates on open-vocabulary semantic segmentation, where a well optimized model is able to segment arbitrary categories that appear in an image. To achieve this goal, we ...