We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CE

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computational Engineering, Finance, and Science

Title: Predicting Accurate Hot Spots in a More Than Ten-Thousand-Core GPU with a Million-Time Speedup over FEM Enabled by a Physics-based Learning Algorithm

Abstract: The classical proper orthogonal decomposition (POD) with the Galerkin projection (GP) has been revised for chip-level thermal simulation of microprocessors with a large number of cores. An ensemble POD-GP methodology (EnPOD-GP) is introduced to significantly improve the training effectiveness and prediction accuracy by dividing a large number of heat sources into heat source blocks (HSBs) each of which may contains one or a very small number of heat sources. Although very accurate, efficient and robust to any power map, EnPOD-GP suffers from intensive training for microprocessors with an enormous number of cores. A local-domain EnPOD-GP model (LEnPOD-GP) is thus proposed to further minimize the training burden. LEnPOD-GP utilizes the concepts of local domain truncation and generic building blocks to reduce the massive training data. LEnPOD-GP has been demonstrated on thermal simulation of NVIDIA Tesla Volta GV100, a GPU with more than 13,000 cores including FP32, FP64, INT32, and Tensor Cores. Due to the domain truncation for LEnPOD-GP, the least square error (LSE) is degraded but is still as small as 1.6% over the entire space and below 1.4% in the device layer when using 4 modes per HSB. When only the maximum temperature of the entire GPU is of interest, LEnPOD-GP offers a computing speed 1.1 million times faster than the FEM with a maximum error near 1.2 degrees over the entire simulation time.
Comments: 8 pages, 8 figures
Subjects: Computational Engineering, Finance, and Science (cs.CE)
Cite as: arXiv:2404.09419 [cs.CE]
  (or arXiv:2404.09419v1 [cs.CE] for this version)

Submission history

From: Ming-Cheng Cheng [view email]
[v1] Mon, 15 Apr 2024 02:15:54 GMT (1687kb)

Link back to: arXiv, form interface, contact.