Title: Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval

Abstract: Fine-grained fashion retrieval searches for items that share a similar attribute with the query image. Most existing methods use a pre-trained feature extractor (e.g., ResNet-50) to capture image representations. However, a pre-trained backbone is typically trained for image classification and object detection, which are fundamentally different tasks from fine-grained fashion retrieval. Existing methods therefore suffer from a feature gap problem when they fine-tune the pre-trained backbone directly. To solve this problem, we introduce an attribute-guided multi-level attention network (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embeddings, thereby enriching the low-level features within these representations. Then, we propose a classification scheme in which images with the same attribute, albeit with different values, are categorized into the same class. This further alleviates the feature gap problem by perturbing object-centric feature learning. Moreover, we propose an improved attribute-guided attention module for extracting more accurate attribute-specific representations. Our model consistently outperforms existing attention-based methods on the FashionAI (62.8788% MAP), DeepFashion (8.9804% MAP), and Zappos50k (93.32% prediction accuracy) datasets. In particular, it improves on the most typical ASENet_V2 model by 2.12, 0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets, respectively. The source code is available at this https URL
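Note on the metric: the abstract reports MAP (mean average precision) for the FashionAI and DeepFashion results. The following is a minimal sketch of the standard MAP computation over ranked retrieval lists, not the authors' evaluation code. The function names are hypothetical, and it assumes each ranked list covers the whole gallery, so the number of relevant items equals the number of hits in the list; the paper's exact protocol may differ.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """AP for one query; relevance[i] is True if the item at rank i+1 is a match.

    Assumption: the list ranks the full gallery, so every relevant item appears in it.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists: Sequence[Sequence[bool]]) -> float:
    """Mean of per-query AP values; returns 0.0 when there are no queries."""
    if not ranked_lists:
        return 0.0
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```

For example, a query whose ranked results are relevant at ranks 1 and 3 has AP = (1/1 + 2/3) / 2, and MAP averages such values over all queries.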
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Journal reference: IEEE Access, vol. 12, pp. 48068-48080, 2024
DOI: 10.1109/ACCESS.2024.3383785
Cite as: arXiv:2301.13014 [cs.CV]
  (or arXiv:2301.13014v2 [cs.CV] for this version)

Submission history

From: Ling Xiao
[v1] Tue, 27 Dec 2022 05:28:38 GMT (20803kb,D)
[v2] Fri, 26 Apr 2024 05:27:38 GMT (4743kb,D)