MIT researchers developed Attention Matching, a KV cache compaction technique that compresses LLM memory by 50x in seconds — ...
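The teaser above describes compacting a transformer's key-value cache. As a rough illustration of the general idea (not the MIT Attention Matching method itself, whose details are not given here), one common family of KV-cache compaction techniques ranks cached token positions by an importance score such as accumulated attention mass and keeps only the top fraction; a minimal sketch, with all names and the 2% keep ratio chosen purely for illustration:

```python
import numpy as np

def compact_kv_cache(keys, values, attn_scores, keep_ratio=0.02):
    """Keep only the highest-scoring token positions in the KV cache.

    keep_ratio=0.02 retains 1 in 50 positions, i.e. 50x compression.
    """
    seq_len = keys.shape[0]
    k = max(1, int(seq_len * keep_ratio))
    # Rank positions by importance score, keep the top k, and restore
    # their original sequence order so positional structure is preserved.
    top = np.sort(np.argsort(attn_scores)[-k:])
    return keys[top], values[top]

# Toy cache: 1000 cached positions with 64-dim keys/values.
rng = np.random.default_rng(0)
seq_len, d = 1000, 64
keys = rng.standard_normal((seq_len, d))
values = rng.standard_normal((seq_len, d))
scores = rng.random(seq_len)  # stand-in for per-position attention mass

k2, v2 = compact_kv_cache(keys, values, scores)
print(keys.shape[0] // k2.shape[0])  # 50x fewer cached positions
```

Real systems score positions with statistics gathered during decoding rather than random values, and may quantize or merge the survivors rather than simply dropping the rest.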
AI infrastructure can't evolve as fast as model innovation. Memory architecture is one of the few levers capable of accelerating deployment cycles. Enter SOCAMM2 ...
The last-level cache (LLC), positioned between external memory and internal subsystems, stores frequently accessed data close to compute resources.
News highlights: 1/3 the power consumption and a 1/3 smaller footprint versus standard RDIMMs — enabled by the industry's first monolithic 32Gb ...
Lightbits Labs Ltd. today is introducing a new architecture aimed at addressing one of the most stubborn bottlenecks in large-scale artificial intelligence inference: the growing mismatch between the ...
Redis, the company behind the popular in-memory data store, which is often used as a cache, vector database or streaming engine, today announced the launch of Redis 8. With this release, the company ...
A technical paper titled “HMComp: Extending Near-Memory Capacity using Compression in Hybrid Memory” was published by researchers at Chalmers University of Technology and ZeroPoint Technologies.
As the demand for real-time data processing escalates, the technology behind Compute Express Link, known as CXL, is emerging as a critical solution for modern data centers. CXL memory is one solution ...
At the Huawei Product & Solution Launch during MWC Barcelona 2026, Yuan Yuan, President of Huawei Data Storage Product Line, officially launched Huawei's AI Data Platform. The platform integrates ...
A novel Linux kernel cross-cache attack named SLUBStick has a 99% success rate at converting a limited heap vulnerability into an arbitrary memory read-and-write capability, letting the researchers elevate ...
Modern multicore systems demand sophisticated strategies to manage shared cache resources. As multiple cores execute diverse workloads concurrently, cache interference can lead to significant ...
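The interference described above can be shown with a toy fully-associative LRU model: a streaming workload evicts a latency-sensitive workload's hot lines from a shared cache, while giving each workload its own partition restores the hit rate. This is only an illustrative simulation (the workload mix, sizes, and tags are invented), not a model of any specific partitioning mechanism:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal fully-associative LRU cache model (one entry = one line)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)     # refresh recency
            return True                      # hit
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict least recently used
        self.lines[addr] = True
        return False                         # miss

# Workload A re-reads a small 4-line set; workload B streams, never reusing.
trace = []
for i in range(100):
    trace.append(("A", i % 4))
    trace += [("B", 8 * i + j) for j in range(8)]

def run(trace, caches):
    """caches maps workload tag -> cache; returns per-tag hit rates."""
    hits = {tag: 0 for tag in caches}
    total = {tag: 0 for tag in caches}
    for tag, addr in trace:
        total[tag] += 1
        hits[tag] += caches[tag].access((tag, addr))
    return {tag: hits[tag] / total[tag] for tag in caches}

shared = LRUCache(8)
print(run(trace, {"A": shared, "B": shared}))  # shared: B thrashes A's set
print(run(trace, {"A": LRUCache(4), "B": LRUCache(4)}))  # partitioned: A recovers
```

In the shared run, B's stream pushes A's four hot lines out before they are reused, so A's hit rate collapses to zero; with a private 4-line partition, A hits on every access after warm-up while B (which never reuses data) loses nothing.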