Hardware Architecture

Authors and titles for cs.AR in Mar 2024

Showing entries 1-25 of 92.
[1]  arXiv:2403.00232 [pdf, other]
Title: FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators
Subjects: Hardware Architecture (cs.AR)
[2]  arXiv:2403.00579 [pdf, other]
Title: NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
Comments: 16 pages, 15 figures
Subjects: Hardware Architecture (cs.AR)
[3]  arXiv:2403.00766 [pdf, other]
Title: Towards Fair and Firm Real-Time Scheduling in DNN Multi-Tenant Multi-Accelerator Systems via Reinforcement Learning
Subjects: Hardware Architecture (cs.AR); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[4]  arXiv:2403.00849 [pdf, other]
Title: NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG); Machine Learning (stat.ML)
[5]  arXiv:2403.01236 [pdf, other]
Title: Performance evaluation of acceleration of convolutional layers on OpenEdgeCGRA
Subjects: Hardware Architecture (cs.AR)
[6]  arXiv:2403.01351 [pdf, ps, other]
Title: Efficient FIR filtering with Bit Layer Multiply Accumulator
Authors: Vincenzo Liguori
Subjects: Hardware Architecture (cs.AR); Emerging Technologies (cs.ET)
[7]  arXiv:2403.03442 [pdf, other]
Title: CAMASim: A Comprehensive Simulation Framework for Content-Addressable Memory based Accelerators
Subjects: Hardware Architecture (cs.AR)
[8]  arXiv:2403.04189 [pdf, ps, other]
Title: Silicon Photonic 2.5D Interposer Networks for Overcoming Communication Bottlenecks in Scale-out Machine Learning Hardware Accelerators
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG); Signal Processing (eess.SP)
[9]  arXiv:2403.04414 [pdf, other]
Title: A methodology to automatically optimize dynamic memory managers applying grammatical evolution
Journal-ref: Journal of Systems and Software, 91, pp. 109-123, 2014
Subjects: Hardware Architecture (cs.AR)
[10]  arXiv:2403.04539 [pdf, other]
Title: PUMA: Efficient and Low-Cost Memory Allocation and Alignment Support for Processing-Using-Memory Architectures
Subjects: Hardware Architecture (cs.AR)
[11]  arXiv:2403.04635 [pdf, ps, other]
Title: Virtuoso: An Open-Source, Comprehensive and Modular Simulation Framework for Virtual Memory Research
Subjects: Hardware Architecture (cs.AR); Operating Systems (cs.OS)
[12]  arXiv:2403.04982 [pdf, other]
Title: A 28.6 mJ/iter Stable Diffusion Processor for Text-to-Image Generation with Patch Similarity-based Sparsity Augmentation and Text-based Mixed-Precision
Comments: Accepted at 2024 IEEE International Symposium on Circuits and Systems (ISCAS)
Subjects: Hardware Architecture (cs.AR)
[13]  arXiv:2403.05037 [pdf, other]
Title: Lightator: An Optical Near-Sensor Accelerator with Compressive Acquisition Enabling Versatile Image Processing
Comments: 6 pages, 10 figures
Subjects: Hardware Architecture (cs.AR); Signal Processing (eess.SP)
[14]  arXiv:2403.05465 [pdf, other]
Title: Algorithm-Hardware Co-Design of Distribution-Aware Logarithmic-Posit Encodings for Efficient DNN Inference
Comments: 2024 61st IEEE/ACM Design Automation Conference (DAC)
Subjects: Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[15]  arXiv:2403.05763 [pdf, other]
Title: HDReason: Algorithm-Hardware Codesign for Hyperdimensional Knowledge Graph Reasoning
Subjects: Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16]  arXiv:2403.06120 [pdf, other]
Title: I/O Transit Caching for PMem-based Block Device
Comments: Accepted by the Journal of Systems Architecture: Embedded Software Design (JSA)
Subjects: Hardware Architecture (cs.AR); Emerging Technologies (cs.ET); Operating Systems (cs.OS)
[17]  arXiv:2403.06664 [pdf, other]
Title: Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
Comments: Published at HPCA 2024 (Best Paper Award Honorable Mention)
Subjects: Hardware Architecture (cs.AR); Machine Learning (cs.LG)
[18]  arXiv:2403.06938 [pdf, other]
Title: TCAM-SSD: A Framework for Search-Based Computing in Solid-State Drives
Subjects: Hardware Architecture (cs.AR)
[19]  arXiv:2403.07039 [pdf, ps, other]
Title: From English to ASIC: Hardware Implementation with Large Language Model
Comments: 15 pages, 1 figure
Subjects: Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[20]  arXiv:2403.07257 [pdf, other]
Title: The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models
Authors: Lei Chen (1), Yiqi Chen (2), Zhufei Chu (3), Wenji Fang (4), Tsung-Yi Ho (5), Ru Huang (2,6), Yu Huang (7), Sadaf Khan (5), Min Li (1), Xingquan Li (8), Yu Li (5), Yun Liang (2), Jinwei Liu (5), Yi Liu (5), Yibo Lin (2), Guojie Luo (2), Zhengyuan Shi (5), Guangyu Sun (2), Dimitrios Tsaras (1), Runsheng Wang (2), Ziyi Wang (5), Xinming Wei (2), Zhiyao Xie (4), Qiang Xu (5), Chenhao Xue (2), Junchi Yan (9), Jun Yang (6), Bei Yu (5), Mingxuan Yuan (1), Evangeline F.Y. Young (5), Xuan Zeng (10), Haoyi Zhang (2), Zuodong Zhang (2), Yuxiang Zhao (2), Hui-Ling Zhen (1), Ziyang Zheng (5), Binwu Zhu (5), Keren Zhu (5), Sunan Zou (2) ((1) Huawei Noah's Ark Lab, (2) Peking University, (3) Ningbo University, (4) Hong Kong University of Science and Technology, (5) The Chinese University of Hong Kong, (6) Southeast University, (7) Huawei HiSilicon, (8) Peng Cheng Laboratory, (9) Shanghai Jiao Tong University, (10) Fudan University)
Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com
Subjects: Hardware Architecture (cs.AR); Emerging Technologies (cs.ET)
[21]  arXiv:2403.07731 [pdf, other]
Title: Performance Analysis of Matrix Multiplication for Deep Learning on the Edge
Comments: 12 pages, 2 Tables, 6 Figures
Journal-ref: High Performance Computing. ISC High Performance 2022 International Workshops. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13387. Springer, Cham
Subjects: Hardware Architecture (cs.AR)
[22]  arXiv:2403.09026 [pdf, other]
Title: FlexNN: A Dataflow-aware Flexible Deep Learning Accelerator for Energy-Efficient Edge Devices
Comments: Version 1. Work started in 2019
Subjects: Hardware Architecture (cs.AR); Neural and Evolutionary Computing (cs.NE)
[23]  arXiv:2403.09070 [pdf, other]
Title: Analytical Heterogeneous Die-to-Die 3D Placement with Macros
Subjects: Hardware Architecture (cs.AR)
[24]  arXiv:2403.09358 [pdf, other]
Title: Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory
Comments: Published in 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA'24)
Subjects: Hardware Architecture (cs.AR)
[25]  arXiv:2403.10538 [pdf, other]
Title: MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications
Subjects: Hardware Architecture (cs.AR); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
