Abstract: This paper presents Side Adapter Network (SAN), a new framework for open-vocabulary semantic segmentation with a pre-trained vision-language model. Our approach models the semantic ...