Computer Science > Computer Vision and Pattern Recognition
Title: Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
(Submitted on 27 Dec 2022 (v1), last revised 26 Apr 2024 (this version, v2))
Abstract: Fine-grained fashion retrieval searches for items that share a similar attribute with the query image. Most existing methods use a pre-trained feature extractor (e.g., ResNet 50) to capture image representations. However, a pre-trained backbone is typically trained for image classification and object detection, which are fundamentally different tasks from fine-grained fashion retrieval. Existing methods therefore suffer from a feature gap problem when they fine-tune the pre-trained backbone directly. To solve this problem, we introduce an attribute-guided multi-level attention network (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embeddings, thereby enriching the low-level features within these representations. Then, we propose a classification scheme in which images with the same attribute, albeit with different values, are categorized into the same class. This further alleviates the feature gap problem by perturbing object-centric feature learning. Moreover, we propose an improved attribute-guided attention module for extracting more accurate attribute-specific representations. Our model consistently outperforms existing attention-based methods when assessed on the FashionAI (62.8788% in MAP), DeepFashion (8.9804% in MAP), and Zappos50k (93.32% in prediction accuracy) datasets. In particular, it improves on the most typical ASENet_V2 model by 2.12, 0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets, respectively. The source code is available at this https URL
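The classification scheme described in the abstract can be illustrated with a small sketch. It is not taken from the authors' code: the `Sample` record, the toy annotations, and both helper functions are hypothetical names introduced here, and the relevance rule (same attribute and same value) is an assumption about how retrieval matches are judged. The sketch only contrasts the two labelings: the auxiliary classification target groups images by attribute and ignores the value, while retrieval still distinguishes values.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    """Hypothetical annotation record: one image, one attribute, one value."""
    image_id: str
    attribute: str
    value: str


# Toy annotations, invented for illustration only.
samples = [
    Sample("img_001", "sleeve_length", "short"),
    Sample("img_002", "sleeve_length", "long"),
    Sample("img_003", "neckline", "v_neck"),
    Sample("img_004", "neckline", "round"),
]


def attribute_class_labels(items: List[Sample]) -> List[int]:
    """Assign one class per attribute, ignoring the attribute value."""
    attributes = sorted({s.attribute for s in items})
    index = {name: i for i, name in enumerate(attributes)}
    return [index[s.attribute] for s in items]


def is_relevant(query: Sample, candidate: Sample) -> bool:
    """Assumed retrieval rule: match on both the attribute and its value."""
    return query.attribute == candidate.attribute and query.value == candidate.value


labels = attribute_class_labels(samples)
print(labels)  # [1, 1, 0, 0]
print(is_relevant(samples[0], samples[1]))  # False
```

Here the short-sleeve and long-sleeve images receive the same class label even though they would not count as a retrieval match for each other under the assumed rule.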
Submission history
From: Ling Xiao
[v1] Tue, 27 Dec 2022 05:28:38 GMT (20803kb,D)
[v2] Fri, 26 Apr 2024 05:27:38 GMT (4743kb,D)