We took this version of HeCBench and are modifying it to build the CUDA and OMP codes to gather their roofline performance data. So far we have a large portion of the CUDA and OMP codes building ...
Abstract: Foundation models (FMs) have revolutionized the generative AI (GAI) lifecycle with their pre-trained intelligence capabilities. While the recent success of Web-based models like GPT-4 has ...
Diffusion Transformers (DiTs) are driving advancements in high-quality image and video generation. With the escalating input context length in DiTs, the computational demand of the Attention mechanism ...
Abstract: As brain-like intelligence develops rapidly, there is an urgent need for a more convenient and efficient control framework to cope with the challenge of processing multisensory signals in ...