We took this version of HeCBench and are modifying it to build the CUDA and OMP codes to gather their roofline performance data. So far we have a large portion of the CUDA and OMP codes building ...
Quantum-inspired adaptive tiling for high-performance matrix multiplication. Uses WKB tunneling physics with the golden ratio to derive optimal tile sizes from real-time CPU state. 15%+ gains on ...
Sparse matrix-matrix multiplication (SpMM) is a crucial kernel in various applications, including sparse deep neural networks [1]–[6], graph analytics [7], triangle counting [8], and linear algebra ...
Abstract: Graph convolutional networks (GCNs) are emerging neural network models designed to process graph-structured data. Due to massively parallel computations using irregular data structures by ...