If you’ve ever wanted to know what happens when you erase technical debt, legacy architecture, and power constraints and start building AI infrastructure from scratch in 2025, VAST Data CEO and founder Renen Hallak is your guy.
So, as you can see in the interview below, we started there—with the hardest question first.
If you had no baggage, no racks full of yesterday’s decisions, how would you build an AI-native cloud? What would you include, what would you leave behind?
“I’d start with a power plant,” he said unequivocally.
Not a hypervisor. Not a fleet of CPUs. A power plant.
Because here’s the truth: AI isn’t constrained by compute anymore. It’s constrained by power. Literal megawatts.
Forget racks of pizza-box servers—he’s talking about 500kW racks jammed with GPU servers on one side, and what he calls “very unintelligent JBOFs” (just a bunch of flash) on the other. All of it lashed together with ultra-fast 400/800Gb Ethernet and NVMe-over-Fabrics, forming the skeletal infrastructure for something far more interesting.
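To put rough numbers on that picture, here's a hypothetical back-of-the-envelope sketch in Python. The node counts, per-node power draws, and class names are all illustrative assumptions, not VAST specifications; only the general shape (GPU servers plus dumb flash shelves on a fast NVMe-oF fabric) comes from Hallak's description.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the rack Hallak describes: GPU servers on one side,
# "very unintelligent" JBOFs on the other, joined by a 400/800Gb Ethernet
# fabric running NVMe-over-Fabrics. All figures below are assumptions.

@dataclass
class GPUServer:
    gpus: int = 8              # accelerators per node (assumed)
    power_kw: float = 10.0     # per-node draw (assumed)

@dataclass
class JBOF:
    nvme_drives: int = 24      # just a bunch of flash, no local compute
    power_kw: float = 0.6      # flash shelves draw comparatively little

@dataclass
class Rack:
    fabric_gbps: int                              # 400 or 800
    servers: list = field(default_factory=list)
    jbofs: list = field(default_factory=list)

    def total_power_kw(self) -> float:
        return (sum(s.power_kw for s in self.servers)
                + sum(j.power_kw for j in self.jbofs))

# A 500kW-class rack: the budget that matters is watts, not sockets.
rack = Rack(fabric_gbps=800,
            servers=[GPUServer() for _ in range(48)],
            jbofs=[JBOF() for _ in range(8)])
print(f"{rack.total_power_kw():.0f} kW")  # -> 485 kW
```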
It’s not about speed and storage. It’s about creating the habitat for a new species.
The Age of the Agent
We are, according to Hallak, building not just clouds or datacenters, but ecosystems for entities—autonomous agents—that will live, interact, learn, forget, and re-contextualize.
The term “application” doesn’t hold. These aren’t stateless requests bouncing off microservices. These are things that remember. That form relationships. That learn over time, reference the past, and make decisions based on it. That evolve.
“Some will be disembodied—interfacing through text,” he said. “Others will live in robots, in drones, in vehicles. But they’ll all need memory. Context. A way to retrieve and cross-correlate data over time.”
What Hallak describes isn’t an app. It’s a being.
This is the world McKinsey breathlessly describes in its reports, the one Meta and OpenAI hint at with agents that don’t just complete tasks but carry goals forward across time. But in Hallak’s view, this isn’t just a product opportunity—it’s an infrastructure crisis.
Because our current systems—siloed data lakes, warehouses, cloud-native “AI stacks” strung together with YAML and hope—weren’t built for agents. They can’t handle persistent memory across billions of interactions. They can’t deliver secure, real-time inference at the edge, on-device, and everywhere in between. They aren’t built for mutual agent communication, auditability, observability, or the kind of long-horizon decision-making that actually mimics cognition.
In other words: agents are coming, and our infrastructure is fundamentally unprepared.
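To make that gap concrete, here is a minimal sketch of the kind of interface persistent agent memory implies. Everything in it is hypothetical: the class names, the keyword-matching recall (a stand-in for real semantic retrieval), and the in-memory list standing in for a distributed store.

```python
import time
from dataclasses import dataclass, field

# Hypothetical sketch of a persistent agent memory API. A real system would
# back this with a distributed store and vector indexes, not a Python list.

@dataclass
class Memory:
    agent_id: str
    timestamp: float
    content: str
    source: str                  # provenance, so recollections are auditable

@dataclass
class AgentMemoryStore:
    memories: list = field(default_factory=list)

    def remember(self, agent_id: str, content: str, source: str) -> None:
        self.memories.append(Memory(agent_id, time.time(), content, source))

    def recall(self, agent_id: str, keyword: str) -> list:
        # Stand-in for semantic retrieval: a real store would embed queries
        # and cross-correlate across modalities, not substring-match.
        return [m for m in self.memories
                if m.agent_id == agent_id and keyword in m.content]

store = AgentMemoryStore()
store.remember("agent-7", "user prefers morning deliveries", source="chat")
print(store.recall("agent-7", "deliveries"))
```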
Inference Eats the World
We’ve spent nearly a decade talking about training AI models. It’s expensive, specialized, GPU-bound work—the kind of thing done in hyperscale datacenters and national labs. But that’s not where the future lies, said Hallak.
“Training is limited,” he explained. “It’s like evolution—millions of years to get to the baseline. But most of the action is inference. That’s the loop we all run: observe, act, learn, repeat.”
Fine-tuning and inference are the new default workloads, and they’re anything but traditional. They require low-latency, distributed infrastructure that doesn’t just serve up predictions—it makes decisions in real time, in context, often in physical space. And when agents start talking to each other—when they coordinate, debate, negotiate—that demand becomes explosive.
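That observe-act-learn loop is simple enough to write down. Here's a schematic sketch; `sense`, `act`, and `learn` are placeholders for whatever perception, decision, and update machinery a given agent actually runs, and none of this reflects a specific VAST API.

```python
# Schematic version of the loop Hallak describes: observe, act, learn, repeat.
# The callables are placeholders, not a real agent framework.

def run_agent(sense, act, learn, steps: int = 1000) -> list:
    context = []                                # long-horizon memory carried forward
    for _ in range(steps):
        observation = sense()                   # observe: ingest new data
        outcome = act(observation, context)     # act: decide in context, in real time
        learn(context, observation, outcome)    # learn: fold the result back in
    return context

# Toy usage: an agent that simply remembers everything it saw and did.
trace = run_agent(sense=lambda: "sensor reading",
                  act=lambda obs, ctx: f"handled {obs}",
                  learn=lambda ctx, obs, out: ctx.append((obs, out)),
                  steps=3)
print(trace)
```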
Inference becomes a form of social computation. And the datacenter? It becomes a nervous system.
The Limits of Legacy
We’ve called it a data platform. A lakehouse. A warehouse. These metaphors are done. They were always rooted in a tabular view of the world—structured, batch-oriented, analytic. But the new world is made of unstructured media: video, audio, language, biological data, events, interactions. It’s not sitting in tables. It’s flying at you at the speed of light from billions of sensors.
“The old systems were built for columns and queries,” Hallak said. “But this—this is about raw data, experiences, memories, and inference cycles that can’t afford to wait.”
And perhaps most damning: old platforms were built to serve CPUs. This new world is GPU-native. It assumes parallelism, vectorization, memory saturation. It assumes agents will want to reason with video, not just text. And that they’ll need to do it from inside cars, drones, phones—on the edge of the edge.
So what replaces the platform?
An Operating System for AI
Operating systems, Hallak reminded me, are creatures of their era.
DOS, Windows, Mac? They were about abstracting CPU. Android, iOS? About hiding mobile hardware complexity behind friendly APIs. In the internet era, it was the browser and Cisco switches—abstracting networks. But now?
“Now we need something to abstract agents,” Hallak argued.
Not just agent execution, but agents’ memory, their context, their observability, their communication, their control. Not just secure enclave execution, but history tracking. Not just I/O, but causality chains. A way to know which agent spoke to whom, and why, and what happened as a result.
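One plausible (and entirely hypothetical) shape for that: give every agent-to-agent message an id and a parent id, so the "who spoke to whom, and why, and what happened" question becomes a walk up a causal chain.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Optional

# Hypothetical causality-chain record. Linking each event to the event that
# caused it lets a human or a system replay the full path after the fact.

@dataclass
class AgentEvent:
    event_id: str
    parent_id: Optional[str]     # the event that triggered this one, if any
    sender: str                  # which agent spoke
    receiver: str                # to whom
    intent: str                  # why: the stated goal behind the message
    result: str                  # what happened as a consequence
    timestamp: float

def emit(parent: Optional[AgentEvent], sender: str, receiver: str,
         intent: str, result: str) -> AgentEvent:
    return AgentEvent(str(uuid.uuid4()),
                      parent.event_id if parent else None,
                      sender, receiver, intent, result, time.time())

# Walking parent_id links from any event reconstructs its causal history.
root = emit(None, "planner", "scheduler", "book a delivery", "slot requested")
leaf = emit(root, "scheduler", "fleet", "assign a vehicle", "van 12 dispatched")
```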
“They need a place to live,” he said. “A habitat.”
It’s the word that stuck with me the most. Habitat. Not just a workload, not just a VM, not even a container. A space where intelligent entities persist, where thoughts are stored, where interactions leave trails, and where a human—or a system—can trace those paths back to their origin. A habitat for cognition at scale.
The World After Compute
At the very end of our conversation, I asked him to look ahead—not one year, not two, but ten. He paused.
“Ten years ago, it was easier,” he admitted. “But now? Everything is going to change. Not just how we work, but how we think, how we assign value, how we define economies.”
Because once we cross the line—once we have thinking systems that don’t just echo what’s known but generate what’s unknown—the game changes. And if those systems are running on infrastructure we built from scratch, designed for their minds, not ours… well.
Evolution is glacial. But technology moves fast.