Title: PIVOT - Input-aware Path Selection for Energy-efficient ViT Inference

Abstract: The attention module in vision transformers (ViTs) performs intricate spatial correlations, contributing significantly to both accuracy and delay. It is therefore important to modulate the number of attention operations according to the input feature complexity for an optimal delay-accuracy tradeoff. To this end, we propose PIVOT, a co-optimization framework that selectively performs attention skipping based on input difficulty. For this, PIVOT employs a hardware-in-loop co-search to obtain optimal attention skip configurations. Evaluations on the ZCU102 MPSoC FPGA show that PIVOT achieves 2.7x lower EDP at a 0.2% accuracy reduction compared to the LVViT-S ViT. PIVOT also achieves 1.3% higher accuracy and 1.8x higher throughput than prior works on traditional CPUs and GPUs. The PIVOT project can be found at this https URL
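
To make the input-aware skipping idea in the abstract concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' code): a transformer encoder block whose attention sublayer runs only when a lightweight difficulty score computed from the input tokens exceeds a threshold. The names DifficultyGate, SkippableBlock, and skip_threshold, as well as the gating heuristic itself, are illustrative assumptions; PIVOT's actual skip configurations come from its hardware-in-loop co-search, which is not modeled here.

```python
import torch
import torch.nn as nn


class DifficultyGate(nn.Module):
    """Scores input 'difficulty' from the mean token embedding (an assumed heuristic)."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, dim // 4),
            nn.GELU(),
            nn.Linear(dim // 4, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim) -> per-sample difficulty score in [0, 1]
        return self.score(x.mean(dim=1)).squeeze(-1)


class SkippableBlock(nn.Module):
    """Transformer encoder block whose attention sublayer is conditionally skipped."""

    def __init__(self, dim: int, heads: int, skip_threshold: float = 0.5):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )
        self.gate = DifficultyGate(dim)
        self.skip_threshold = skip_threshold  # illustrative knob, not PIVOT's actual mechanism

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Run attention only when the input looks "hard"; otherwise keep the
        # residual path and pay only the MLP cost.
        if self.gate(x).mean() >= self.skip_threshold:
            h = self.norm1(x)
            x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x


if __name__ == "__main__":
    block = SkippableBlock(dim=192, heads=3)
    tokens = torch.randn(2, 197, 192)  # a ViT-Tiny-sized token sequence
    print(block(tokens).shape)         # torch.Size([2, 197, 192])
```

In a full model, easy inputs would follow paths with more skipped attention blocks and hence a lower energy-delay product, while hard inputs would retain more attention; the skip pattern or threshold is the kind of knob that a hardware-in-loop co-search would tune against measured delay and energy on the target FPGA.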
Comments: Accepted to 61st ACM/IEEE Design Automation Conference (DAC '24), June 23--27, 2024, San Francisco, CA, USA (6 Pages)
Subjects: Hardware Architecture (cs.AR)
DOI: 10.1145/3649329.3655679
Cite as: arXiv:2404.15185 [cs.AR]
  (or arXiv:2404.15185v1 [cs.AR] for this version)

Submission history

From: Abhishek Moitra
[v1] Wed, 10 Apr 2024 17:56:41 GMT (10790kb,D)
