Breaking through AI’s memory wall with token warehousing

VB Event
VB Staff
January 15, 2026

Shimon Ben-David, CTO, WEKA and Matt Marshall, Founder & CEO, VentureBeat

As agentic AI moves from experiments to real production workloads, a quiet but serious infrastructure problem is coming into focus: memory. Not compute. Not models. Memory.

Under the hood, today’s GPUs simply don’t have enough space to hold the Key-Value (KV) caches that modern, long-running AI agents depend on to maintain context. The result is a lot of invisible waste: GPUs redoing work they’ve already done, cloud costs climbing, and performance taking a hit. It’s a problem that’s already showing up in production environments, even if most people haven’t named it yet.

At a recent stop on the VentureBeat AI Impact Series, WEKA CTO Shimon Ben-David joined…
