Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval

Xiao, Ling; Yamasaki, Toshihiko

doi:10.1109/ACCESS.2024.3383785

Computer Science > Computer Vision and Pattern Recognition

Title: Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval

Authors: Ling Xiao, Toshihiko Yamasaki

(Submitted on 27 Dec 2022 (v1), last revised 26 Apr 2024 (this version, v2))

Abstract: Fine-grained fashion retrieval searches for items that share a similar attribute with the query image. Most existing methods use a pre-trained feature extractor (e.g., ResNet-50) to capture image representations. However, a pre-trained backbone is typically trained for image classification and object detection, which are fundamentally different tasks from fine-grained fashion retrieval. Existing methods therefore suffer from a feature gap problem when the pre-trained backbone is fine-tuned directly. To solve this problem, we introduce an attribute-guided multi-level attention network (AG-MAN). Specifically, we first enhance the pre-trained feature extractor to capture multi-level image embeddings, thereby enriching the low-level features within these representations. Then, we propose a classification scheme in which images with the same attribute, albeit with different values, are categorized into the same class. This further alleviates the feature gap problem by perturbing object-centric feature learning. Moreover, we propose an improved attribute-guided attention module for extracting more accurate attribute-specific representations. Our model consistently outperforms existing attention-based methods on the FashionAI (62.8788% MAP), DeepFashion (8.9804% MAP), and Zappos50k (93.32% prediction accuracy) datasets. In particular, it improves on the representative ASENet_V2 model by 2.12, 0.31, and 0.78 percentage points on the FashionAI, DeepFashion, and Zappos50k datasets, respectively. The source code is available at this https URL
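The classification scheme described in the abstract groups images by attribute rather than by attribute value. The following is a minimal sketch of one way to read that labeling rule, not the authors' implementation: the annotation records, the attribute and value names, and the helper functions `build_attribute_classes` and `attribute_level_labels` are all hypothetical and chosen only for illustration.

```python
# Hypothetical annotations: (image_id, attribute, value).
# The attribute and value names are illustrative, not taken from the paper.
annotations = [
    ("img_001", "sleeve_length", "short"),
    ("img_002", "sleeve_length", "long"),
    ("img_003", "collar_design", "round"),
    ("img_004", "sleeve_length", "sleeveless"),
    ("img_005", "collar_design", "v_neck"),
]


def build_attribute_classes(records):
    """Map each attribute name to an integer class id, ignoring the value."""
    attributes = sorted({attribute for _, attribute, _ in records})
    return {attribute: index for index, attribute in enumerate(attributes)}


def attribute_level_labels(records, class_ids):
    """Give every image the class id of its attribute, whatever its value."""
    return [
        (image_id, class_ids[attribute])
        for image_id, attribute, _ in records
    ]


class_ids = build_attribute_classes(annotations)
labels = attribute_level_labels(annotations, class_ids)

for image_id, class_id in labels:
    print(image_id, class_id)
```

Under this reading, a short-sleeved and a long-sleeved image receive the same class label because both are annotated with the sleeve-length attribute, so the classification target depends on which attribute is annotated rather than on the specific value.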

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Journal reference:	IEEE Access, vol. 12, pp. 48068-48080, 2024
DOI:	10.1109/ACCESS.2024.3383785
Cite as:	arXiv:2301.13014 [cs.CV]
	(or arXiv:2301.13014v2 [cs.CV] for this version)

Submission history

From: Ling Xiao
[v1] Tue, 27 Dec 2022 05:28:38 GMT (20803kb,D)
[v2] Fri, 26 Apr 2024 05:27:38 GMT (4743kb,D)
