The Wall Between Data and Compute Is Coming Down

Authored by

Jim Crook - Director, Corporate Communications

The story of corporate computing has always been a story of separation. On one side of the data center sits your data - the secure, massive repository of everything your company knows. On the other side sits your compute power - the processors and graphics cards that actually do the thinking.

Whenever an organization wants to gain an insight, run a prediction, or process a piece of information, it packages that data up, pulls it out of storage, and ships it across a congested network to the computers.

That worked fine when we were just running batch jobs on old-school data lakes. But today we’re staring down the barrel of agentic computing. Think millions of AI agents slamming systems with millions of real-time video streams, unstructured documents, and transactional data every single second.

Try feeding that level of computational intensity into a legacy architecture and the resulting east-west traffic grinds activity to a halt - not to mention the increased complexity, costs, and security risks.

Enterprises are approaching a fundamental shift in how computing systems are designed. As VAST Data sees it, the answer is deceptively simple: stop moving data to the computers and start bringing compute directly to the data.

The Rise of the Living Data Layer

VAST sits at the core of this infrastructure shift, and Simon Golan, VAST’s data solutions team lead, noted in a recent talk that the VAST DataEngine in particular presents a new framework with massive implications for enterprise computing.

Golan describes the DataEngine as a “real-time data flow computer.”

"What we mean is bringing the compute into your data - bringing the compute right next to your storage file system - and making it basically much smarter, much more actionable, and much more achievable for different kinds of goals," he said.

How does that work with the VAST DataEngine? As Golan described it, the moment a file, video frame, or customer transaction lands in the data layer, the system can run the necessary business logic on it instantly. Instead of orchestrating complex batch migration jobs, teams get data pipelines that run programmatically, at sub-second and in some cases sub-millisecond latency, the exact moment data lands.

It’s powerful technology and the practical implications are significant. Data teams no longer have to spend weeks configuring complex data pipelines, networks, and servers. They simply deploy the code into the data layer, and the system handles the scaling, resilience, and processing automatically.

python

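# Settings and MyClient are placeholders for application-specific classes
# defined elsewhere (not shown here).
from vast_runtime.vast_event import VastEvent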

def init(ctx):
    """Initialize - runs once per function instance"""
    with ctx.tracer.start_as_current_span("Function Init"):
        settings = Settings.from_ctx_secrets(ctx.secrets)
        ctx.client = MyClient(settings)
        ctx.logger.info("Function initialized")

def handler(ctx, event: VastEvent):
    """Handler - processes each event"""
    with ctx.tracer.start_as_current_span("Handler") as span:
        data = event.get_data()
        span.set_attribute("event_type", event.get_type())
        ctx.logger.info(f"Processing: {data.get('filename')}")
        result = ctx.client.process(data)
        return result

Example of writing and containerizing serverless functions

Software engineers, data scientists, and pipeline managers can collaborate on the same data workflows in one unified environment. And rather than stitching together external monitoring tools to figure out why a data pipeline failed, teams get native, granular tracking built right into the data layer that allows them to trace exactly how much time it took to process an individual asset.
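As a rough illustration of what per-asset tracking means, the sketch below times each asset in plain Python. It does not use the platform's built-in tracing; `AssetTimer` and `process_asset` are hypothetical names, and the sleep calls stand in for real processing.

```python
import time
from contextlib import contextmanager


class AssetTimer:
    """Hypothetical stand-in for a platform tracer: records seconds spent per asset."""

    def __init__(self):
        self.durations = {}  # asset name -> elapsed seconds

    @contextmanager
    def span(self, asset_name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if processing raised an error.
            self.durations[asset_name] = time.perf_counter() - start

    def slowest(self):
        """Return the (asset, seconds) pair that took longest, or None if empty."""
        if not self.durations:
            return None
        return max(self.durations.items(), key=lambda item: item[1])


def process_asset(name, work_seconds):
    """Placeholder for real business logic; it only sleeps."""
    time.sleep(work_seconds)


timer = AssetTimer()
for name, cost in [("frame_001.jpg", 0.001), ("frame_002.jpg", 0.05)]:
    with timer.span(name):
        process_asset(name, cost)

slowest_asset, slowest_seconds = timer.slowest()
print(f"slowest asset: {slowest_asset} ({slowest_seconds:.3f}s)")
```

Keeping the timing inside a context manager means every asset gets a duration even when its processing fails, which is the case where a trace is most useful.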

Real-Time Vision: The Smart City Blueprint

To understand what this looks like in practice, consider the city-scale video reasoning blueprint Golan showcased during the presentation. In a traditional enterprise data architecture, building a multi-camera video intelligence system requires ingesting live streams into an intermediate buffer, writing them to disk, copying chunks over to a GPU cluster for object detection, and then exporting the resulting metadata to a separate database. The latency that accumulates across those steps makes real-time response impossible.

Now consider the entire architecture as a single, inline event. Live video feeds and imagery are ingested from cameras across an entire urban infrastructure and hit the storage layer as raw data files. Instantly, the platform's internal broker triggers a serverless function that passes the video clip to a visual AI model. The video is decoded, vectorized, and semantically indexed entirely within the storage environment using native database tables. This is the capability that has VAST customers excited.
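The flow described here can be sketched in a few lines of plain Python. This is a toy model, not the DataEngine API: `Broker`, `on_clip_landed`, `fake_embed`, and the in-memory `index_table` are all hypothetical, and the "embedding" is a trivial vowel count standing in for a real visual model.

```python
from collections import defaultdict


class Broker:
    """Toy in-process event broker: maps event types to handler functions."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Invoke every subscribed handler synchronously, in registration order.
        return [handler(payload) for handler in self._handlers[event_type]]


def fake_embed(description):
    """Stand-in for a visual AI model: counts vowels to build a tiny vector."""
    text = description.lower()
    return [text.count(vowel) for vowel in "aeiou"]


index_table = []  # stands in for a native database table


def on_clip_landed(clip):
    """Triggered when a clip lands: embed it and index the result in place."""
    row = {
        "camera_id": clip["camera_id"],
        "path": clip["path"],
        "vector": fake_embed(clip["description"]),
    }
    index_table.append(row)
    return row


broker = Broker()
broker.subscribe("clip.landed", on_clip_landed)

# Simulate a camera writing a clip into the storage layer.
broker.publish(
    "clip.landed",
    {
        "camera_id": "cam-17",
        "path": "/feeds/cam-17/0001.mp4",
        "description": "slush on avenue",
    },
)
print(index_table)
```

The point of the sketch is the shape of the flow: the write itself is the trigger, and the indexed row ends up next to the data rather than in a separate system.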

Golan's VAST DataEngine demo showed how users can query real-time video monitoring in Manhattan. Modeling live monitoring of February’s severe New York blizzard, he asked the system: "Is Manhattan still full of slush?" By combining traditional database column filtering - such as selecting specific camera IDs over the last five minutes - with vector similarity search, the engine parsed the visual content of the live video instantly.
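One way to picture this hybrid query is as a two-stage operation: filter rows on ordinary columns, then rank the survivors by vector similarity. The sketch below does this over an in-memory list. The row layout, the three-dimensional vectors, the fixed timestamp, and `hybrid_search` are illustrative assumptions, not the VAST query interface.

```python
import math
from datetime import datetime, timedelta


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(rows, query_vector, camera_ids, since, top_k=3):
    """Filter by camera ID and timestamp, then rank by similarity to the query."""
    candidates = [
        row for row in rows
        if row["camera_id"] in camera_ids and row["captured_at"] >= since
    ]
    candidates.sort(
        key=lambda row: cosine_similarity(row["vector"], query_vector),
        reverse=True,
    )
    return candidates[:top_k]


# Arbitrary fixed timestamp so the example is repeatable.
now = datetime(2000, 1, 1, 12, 0, 0)

rows = [
    {"clip": "a.mp4", "camera_id": "cam-1",
     "captured_at": now - timedelta(minutes=2), "vector": [0.9, 0.1, 0.0]},
    {"clip": "b.mp4", "camera_id": "cam-2",
     "captured_at": now - timedelta(minutes=1), "vector": [0.1, 0.9, 0.0]},
    {"clip": "c.mp4", "camera_id": "cam-1",
     "captured_at": now - timedelta(minutes=30), "vector": [1.0, 0.0, 0.0]},
    {"clip": "d.mp4", "camera_id": "cam-9",
     "captured_at": now - timedelta(minutes=1), "vector": [1.0, 0.0, 0.0]},
]

query = [1.0, 0.0, 0.0]  # pretend embedding of the user's question
results = hybrid_search(
    rows,
    query,
    camera_ids={"cam-1", "cam-2"},
    since=now - timedelta(minutes=5),
)
print([row["clip"] for row in results])
```

Clips `c.mp4` and `d.mp4` match the query vector exactly but are excluded by the column filter (too old, wrong camera), which is why the structured filter runs before the similarity ranking.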

Because the video intelligence sits entirely on top of the native file system, users could immediately replay the clip, look at the AI model's text-based structural reasoning, and review historical frames without data ever leaving the physical appliance. The platform ships this architecture as a fully open-source blueprint, allowing enterprise architects to adapt it directly to their own environments.

From Answers to Action…

The next evolution of this architecture emerges when the human is removed from the loop entirely. During the presentation, Golan also discussed AgentEngine, a framework that allows AI agents to interact directly with the data layer and orchestrate workflows across the platform. In the Manhattan snowstorm example, a user might initially ask, "Is Manhattan still full of slush?" and an AI chatbot or assistant would simply return an answer.

An agentic system can go much further. It can recognize that the user's underlying objective is monitoring changing road conditions, offer to track the situation continuously, and automatically notify the user when conditions improve.

What makes this significant is that the agent is not merely generating text. It is reasoning about available tools, triggering workflows, querying live data sources, invoking AI models, and maintaining awareness of changing conditions over time. The same infrastructure that ingests video streams, executes serverless functions, and performs vector searches becomes the operational substrate for autonomous software agents.
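The difference between answering once and monitoring continuously can be shown with a small loop. In this sketch, `RoadConditionAgent`, the slush scores, and the 0.3 threshold are invented for illustration; it is not AgentEngine code, and a real agent would call live data sources and models rather than read from a list.

```python
class RoadConditionAgent:
    """Toy agent: watches a stream of readings and notifies once conditions improve."""

    def __init__(self, threshold, notify):
        self.threshold = threshold  # slush score below this counts as "improved"
        self.notify = notify        # callback used to alert the user
        self.notified = False

    def answer(self, reading):
        """One-shot behavior: just answer the question for the current reading."""
        return "yes" if reading >= self.threshold else "no"

    def observe(self, reading):
        """Agentic behavior: evaluate each new reading and act when it matters."""
        if not self.notified and reading < self.threshold:
            self.notify(f"Road conditions improved (slush score {reading:.2f})")
            self.notified = True


messages = []
agent = RoadConditionAgent(threshold=0.3, notify=messages.append)

# Hypothetical slush scores arriving over time (1.0 = fully covered).
for score in [0.9, 0.7, 0.4, 0.2, 0.1]:
    agent.observe(score)

print(agent.answer(0.9))
print(messages)
```

The `answer` method is the chatbot case: one question, one reply. The `observe` method is the agentic case: the same evaluation runs on every new reading, and the user hears back only once, when the condition they care about changes.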

In this model, AI shifts from answering questions to managing processes. The Manhattan query ceases to be a one-time interaction and instead becomes an ongoing workflow in which the agent continuously observes, evaluates, and acts on incoming information. That may be the most important implication of all: the emergence of a data platform that doesn't just serve intelligence, but actively operationalizes it.

… From Data Movement to Decision Making

Back to Golan’s broader argument: As enterprises move from static analytics toward continuous reasoning systems, architectures built around moving data between disconnected silos become increasingly difficult to justify.

The future VAST describes is one where storage is an active participant in computation, capable of triggering workflows, enriching data, and serving as the foundation for real-time AI operations.

That distinction becomes increasingly important as enterprises turn their gaze toward agentic systems. The next generation of enterprise applications will not consist of a handful of users running reports against a database. They will consist of thousands (maybe millions) of agents continuously observing, reasoning, and acting on streams of information.

Every additional hop between storage, databases, message brokers, vector indexes, and AI services introduces delay, complexity, and cost. The organizations that succeed will be those that remove those hops entirely.

Seen through that lens, the Manhattan snowstorm demonstration wasn’t really about video analytics. It was a preview of a broader computing model in which data, databases, event processing, and AI inference converge into a single operational environment. The question being asked - "Is Manhattan still full of slush?" - was almost incidental. What mattered was how quickly the system could transform raw information into an actionable answer.

VAST is betting on a future where compute moves to data and the boundary between storage and computation effectively disappears. As enterprises prepare for a world defined by autonomous systems and continuous machine reasoning, the winners may not be those with the most GPUs, but those that can transform information into action with the fewest architectural barriers in between.
