Product
Nov 7, 2025

VAST DataEngine: Bringing Compute to Your Data

Authored by

Andy Pernsteiner, Field CTO

At VAST we take great pride in listening to our customers and building features and functionality that solve real problems for them. Our customers work with some of the world’s largest data sets and regularly push the boundaries of what’s possible, which drives VAST to break new ground and deliver innovative ways to manage and extract value from data.

One of the key challenges they face is processing data at scale. Whether they are responding to events in real time, running scheduled tasks, or leveraging AI capabilities, they need to do so without managing complex infrastructure and in a way that minimizes data movement and duplication. To tackle this, VAST has created DataEngine, a serverless computing platform that brings processing directly to your data.

This technology also allows your users and applications to process data automatically as it arrives, enabling next-generation AI and ML applications to work with data in real time without the overhead of traditional orchestration systems.

What is VAST DataEngine?

VAST DataEngine is a serverless computing platform integrated into the VAST AI Operating System. It enables you to build, deploy, and scale data processing functions without worrying about infrastructure management. The platform handles the complexity of scheduling, event detection, and resource allocation so you can focus on what matters: your business logic.

At its core, DataEngine provides three powerful building blocks:

  1. Functions - Your application code packaged as containers

  2. Triggers - Event sources that start your functions (schedules, S3 events, etc.)

  3. Pipelines - The orchestration layer that connects triggers to functions with intelligent resource management
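
To make the relationship between these three building blocks concrete, here is a minimal Python sketch. The class names, field names, and the example image reference are illustrative assumptions, not the DataEngine API; the sketch only shows that a pipeline ties one trigger to one or more containerized functions and carries resource settings.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Function:
    """Application code packaged as a container (hypothetical model)."""
    name: str
    image: str  # container image reference


@dataclass
class Trigger:
    """An event source that starts functions (hypothetical model)."""
    name: str
    kind: str  # e.g. "schedule" or "s3-event"
    config: Dict[str, str] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Connects a trigger to functions and holds resource settings."""
    name: str
    trigger: Trigger
    functions: List[Function]
    min_concurrency: int = 0
    max_concurrency: int = 1

    def describe(self) -> str:
        # Render the pipeline as "kind:trigger -> fn1 -> fn2".
        steps = " -> ".join(f.name for f in self.functions)
        return f"{self.trigger.kind}:{self.trigger.name} -> {steps}"
```

For example, a pipeline built from a `Trigger("new-file", "s3-event")` and a single `Function` named `summarize` would describe itself as `s3-event:new-file -> summarize`.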

Use Cases

Here are a few examples of how customers can leverage VAST DataEngine:

Scheduled Data Processing

Many organizations need to run periodic health checks, generate reports, or perform maintenance tasks on a schedule. Historically, this has required setting up cron jobs, managing compute resources, or building complex orchestration systems. With DataEngine, this becomes trivial.

Imagine if you could:

  1. Write a simple Python function that performs your task

  2. Define a schedule trigger (like "every 1 minute" or cron syntax)

  3. Deploy the function and trigger

  4. Let the platform handle everything else, including scaling, resource allocation, monitoring, and error handling

Here’s what a simple scheduled function looks like:
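
As a minimal sketch, assuming a hypothetical `handler(ctx, event)` entry point and a hypothetical `trigger` field on the event (the actual DataEngine function signature and event payload may differ), a periodic health check could be written like this:

```python
import json
import shutil
from datetime import datetime, timezone


def handler(ctx, event):
    """Hypothetical entry point, invoked once per scheduled tick."""
    # Example task: report how much disk space is free on the root volume.
    usage = shutil.disk_usage("/")
    free_pct = round(100 * usage.free / usage.total, 1)

    report = {
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "trigger": event.get("trigger", "unknown"),  # assumed event field
        "free_disk_pct": free_pct,
        "status": "ok" if free_pct > 10 else "low_disk",
    }

    # Printing to stdout so the platform's log collection can pick it up.
    print(json.dumps(report))
    return report
```

The function holds only the task itself. Scheduling is left entirely to the trigger, which is the point of the model.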

The function lifecycle is straightforward:

In the DataEngine UI, you define your trigger with a schedule, connect it to your function, and deploy.

AI-Powered Event-Driven Processing

Now let’s explore something more sophisticated: automatically processing files as they arrive in your S3-compatible storage and generating AI summaries in real time.

This is where DataEngine truly shines. When a new file is created in your storage bucket, the system automatically:

  1. Detects the event

  2. Invokes your function

  3. Downloads the file content

  4. Sends it to an AI service for summarization

  5. Returns the results

Imagine if you could:

  1. Upload a file to your S3-compatible storage

  2. Have the system automatically detect the new file

  3. Parse and send text to an LLM for summarization

  4. Insert the results into a VAST DataBase table automatically

This entire pipeline runs automatically with no manual intervention. Files are processed as soon as they arrive and enriched with AI-generated summaries.

Here’s how the function works:
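
A minimal sketch of such a function follows. It assumes the event uses the AWS-style S3 notification layout (`Records[0].s3.bucket.name` and `Records[0].s3.object.key`); the `handler` signature and the injected `fetch`, `summarize`, and `insert_row` callables are hypothetical stand-ins for the storage client, the LLM call, and the DataBase write, not DataEngine APIs.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import unquote_plus


def parse_s3_event(event: dict) -> Tuple[str, str]:
    """Return (bucket, key) from the first record of an S3-style event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys are URL-encoded in AWS-style notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    return bucket, key


def handler(
    event: dict,
    fetch: Callable[[str, str], bytes],            # downloads object bytes
    summarize: Callable[[str], str],               # calls the AI service
    insert_row: Callable[[Dict[str, str]], None],  # writes to a table
) -> Dict[str, str]:
    """Hypothetical entry point: detect, download, summarize, store."""
    bucket, key = parse_s3_event(event)
    text = fetch(bucket, key).decode("utf-8", errors="replace")
    row = {"bucket": bucket, "key": key, "summary": summarize(text)}
    insert_row(row)
    return row
```

Passing the three dependencies in as callables keeps the sketch runnable without any external service. In a real deployment they would wrap an S3 client, an LLM endpoint, and a database writer.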

The flow looks like this:

How it Works

Now that you understand how VAST DataEngine can be used, I’ll explain some of the details of what we built, and how it works.

Developer Experience

One of the key design principles behind DataEngine is making it easy for developers to get started. The platform provides a complete CLI toolchain that integrates seamlessly with your development workflow.

Getting started is remarkably simple:

This creates a scaffolding like so:

You can build and test functions locally, push container images to registries, and manage your entire data processing infrastructure from the command line.

From there, configure and deploy your function using either a YAML file or the DataEngine UI.

Rinse and repeat with as many functions as you like, and then build a pipeline.

You can do this either via the UI…

Or from a YAML file, to aid in automation, version control, etc.

Once you deploy, it’s ready to run. Simple! DataEngine automatically scales your functions based on demand. Set minimum and maximum concurrency, and it handles the rest: it spins up containers when needed and scales down during quiet periods.
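
The effect of the minimum and maximum concurrency settings can be illustrated with a short sketch. The scaling policy below (one container per fixed batch of pending events, clamped to the configured bounds) is an assumption made for illustration; the source does not describe the actual algorithm DataEngine uses.

```python
import math


def desired_containers(
    pending_events: int,
    events_per_container: int,
    min_concurrency: int,
    max_concurrency: int,
) -> int:
    """Illustrative policy: size to demand, then clamp to [min, max]."""
    if events_per_container <= 0:
        raise ValueError("events_per_container must be positive")
    if min_concurrency > max_concurrency:
        raise ValueError("min_concurrency must not exceed max_concurrency")

    # Containers needed to absorb the current backlog.
    needed = math.ceil(pending_events / events_per_container)

    # Never drop below the floor or exceed the ceiling.
    return max(min_concurrency, min(max_concurrency, needed))
```

With a floor of 1 and a ceiling of 8, an idle pipeline keeps one container running, a backlog of 25 events at 10 per container requests three, and a burst of 1,000 events is capped at eight.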

Built-in Observability

Any data processing platform must provide visibility into what’s happening. DataEngine doesn’t just run your code; it provides comprehensive observability out of the box. All telemetry and logs from your functions are streamed into VAST DataBase tables, which are rendered in the UI and can be queried through the CLI and API. This means you can debug issues, optimize performance, and understand system behavior without additional instrumentation. Because this observability is built into the platform itself, it requires no administration or setup.
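
Because logs land in tables, debugging questions become queries. The sketch below uses Python’s built-in `sqlite3` purely as a local stand-in for a log table; the table name `function_logs` and its columns are hypothetical, and VAST DataBase is queried through its own interfaces rather than SQLite.

```python
import sqlite3
from typing import Dict, Iterable, Tuple

LogRow = Tuple[str, str, str, str]  # (timestamp, function_name, level, message)


def error_counts(rows: Iterable[LogRow]) -> Dict[str, int]:
    """Count ERROR-level log lines per function in an in-memory table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE function_logs "
            "(ts TEXT, function_name TEXT, level TEXT, message TEXT)"
        )
        conn.executemany("INSERT INTO function_logs VALUES (?, ?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT function_name, COUNT(*) FROM function_logs "
            "WHERE level = 'ERROR' GROUP BY function_name ORDER BY function_name"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()
```

The same pattern of filtering by level and grouping by function applies to whichever query interface you use against the real log tables.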

Pipeline Logs:

Pipeline Traces:

Enterprise-Grade Integration

Keep in mind that DataEngine leverages all of the other components of the VAST AI OS:

  • The VAST DataStore exposes an S3-compatible object interface, which means your data processing functions can seamlessly access your data lake.

  • The VAST DataBase provides a scalable, performant transactional and analytical store where apps can write results, labels, and other metadata. It even supports storage and retrieval of vector embeddings, enabling scalable Enterprise RAG.
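
To show what retrieval over vector embeddings means in a RAG workflow, here is a small pure-Python sketch of cosine-similarity search. It illustrates the general technique only: the function names are hypothetical, and a real deployment would run the similarity search inside the database rather than in application code.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], docs: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the ids of the k stored embeddings most similar to the query."""
    ranked = sorted(docs, key=lambda doc_id: cosine(query, docs[doc_id]), reverse=True)
    return ranked[:k]
```

In a RAG pipeline, the documents behind the returned ids would be passed to the LLM as context for answering the query.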

Real-World Applications

The scenarios we explored are just the beginning. Here’s what organizations are building with DataEngine:

  • Data Ingestion Pipelines - Automatically process and validate incoming data files

  • AI/ML Workflows - Transform data, generate embeddings, and run inference pipelines

  • Content Processing - Extract text, images, and metadata from uploaded files

  • Data Transformation - Convert formats, enrich data, and prepare it for analytics

  • Monitoring and Alerting - Periodic health checks and automated response systems

  • ETL Workflows - Extract, transform, and load data between systems

Build, Deploy, and Scale Data Functions Without Managing Infrastructure

VAST DataEngine is yet another concrete example of how VAST is a differentiated data platform. We can’t wait to see how our customers leverage it in their workflows.

The platform provides everything you need to go from zero to production-ready data processing functions in minutes. Whether you’re building scheduled batch jobs, real-time event processing systems, or AI-powered data pipelines, VAST DataEngine provides the serverless foundation that lets you focus on solving business problems rather than managing infrastructure.
