Title: FLAASH: Flexible Accelerator Architecture for Sparse High-Order Tensor Contraction

Abstract: Tensors play a vital role in machine learning (ML) and often exhibit properties that are best explored while preserving their high-order structure. Performing ML computations efficiently requires taking advantage of sparsity, but generalized hardware support is challenging. This paper introduces FLAASH, a flexible and modular accelerator design for sparse tensor contraction that achieves over 25x speedup for a deep learning workload. Our architecture performs sparse high-order tensor contraction by distributing sparse dot products, or portions thereof, to numerous Sparse Dot Product Engines (SDPEs). Memory structure and job distribution can be customized, and we demonstrate a simple approach as a proof of concept. We address the challenges of control flow for navigating data structures, high-order representation, and high-sparsity handling. Various evaluations demonstrate the effectiveness of our approach, showing significant speedup as sparsity and order increase.
Comments: 10 pages, 3 figures
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG)
Cite as: arXiv:2404.16317 [cs.AR]
  (or arXiv:2404.16317v1 [cs.AR] for this version)
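
The abstract describes sparse high-order tensor contraction as a collection of sparse dot products handed out to SDPEs. The following is a minimal software sketch of that decomposition, not the paper's hardware design: the coordinate-dictionary storage, the function names (`sparse_dot`, `fibers`, `contract`), and the sequential loop standing in for job distribution are all assumptions made for illustration.

```python
from collections import defaultdict


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as sorted (index, value) lists.

    Walks both lists with two pointers and multiplies only where indices match.
    """
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        ia, va = a[i]
        ib, vb = b[j]
        if ia == ib:
            total += va * vb
            i += 1
            j += 1
        elif ia < ib:
            i += 1
        else:
            j += 1
    return total


def fibers(coo, mode):
    """Group a sparse tensor {coords: value} into fibers along `mode`.

    Each fiber is keyed by the remaining (free) coordinates and holds a
    sorted list of (index along `mode`, value) pairs.
    """
    out = defaultdict(list)
    for coords, val in coo.items():
        free = coords[:mode] + coords[mode + 1:]
        out[free].append((coords[mode], val))
    for fiber in out.values():
        fiber.sort()
    return out


def contract(x, x_mode, y, y_mode):
    """Contract mode `x_mode` of x with mode `y_mode` of y.

    Every pair of fibers is one independent dot-product job; here the jobs
    run in a plain loop (hypothetical stand-in for dispatch to many engines).
    """
    fx, fy = fibers(x, x_mode), fibers(y, y_mode)
    result = {}
    for kx, vx in fx.items():
        for ky, vy in fy.items():
            v = sparse_dot(vx, vy)
            if v != 0.0:
                result[kx + ky] = v  # store nonzeros only
    return result
```

Because each fiber pair is independent, the inner loop body is the natural unit of work to distribute; the sketch makes no claim about how FLAASH actually schedules or splits those jobs.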

Submission history

From: Lizhong Chen
[v1] Thu, 25 Apr 2024 03:46:53 GMT