Serverless Python Functions

VAST DataEngine runs custom, lightweight, serverless Python functions directly where your data lives. Functions execute on VAST Native Compute in stateless containers on VAST compute nodes (CNodes). Replace batch jobs and ETL scripts with custom Python code for real-time, event-driven processing.

Native Query Engine Reimagined for Today’s Data and AI Workloads

Run analytics where data lives. The VAST Native Query Engine unifies vector similarity search, SQL aggregation, and hybrid queries in a single execution framework, simplifying architecture and accelerating insights across AI and data workloads.

In-Place Compute Execution

Run serverless Python functions directly on your data with VAST DataEngine. Build and orchestrate data and AI pipelines in code or visually, without managing infrastructure or moving data. Execute transformations, enrichment, and inference in place, with native compute and real-time, event-driven scaling.

Run Python Functions Directly on Data in the Platform

VAST serverless Python functions execute directly on data stored in VAST, eliminating the need to move data to external compute systems. Functions operate on files, objects, structured data, and vectors in place, using the same platform for both storage and execution. This approach reduces latency, avoids data duplication, and simplifies data workflows. By bringing compute to the data, VAST enables real-time processing and transformation without requiring separate infrastructure or complex data movement across systems.

Reduce Data Movement and External Pipelines

Traditional architectures rely on pipelines to move and prepare data for processing, introducing latency and operational overhead. VAST removes this dependency by executing functions directly within the platform, so data transformation, enrichment, and processing occur in place. Organizations can shift to continuous, real-time execution without maintaining separate data pipelines or synchronization processes.

Serverless Python, Without Constraints

Author Fully Custom Python Logic for Any AI and Data Workflow

Users can write custom Python functions to perform any data operation, from normalization and transformation to orchestration and AI processing. There are no predefined constraints on logic or workflows: if it can be expressed in Python, it can be executed within the platform. Functions can integrate with external services, call models, process events, or coordinate multi-step workflows. This flexibility gives developers full control over how data is handled, without requiring external compute systems or specialized processing frameworks.
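As an illustration of the kind of logic described above, here is a minimal sketch of a normalization function. The function name, the shape of the `event` dictionary, and the field names are assumptions made for this example; they are not taken from the VAST DataEngine API.

```python
from typing import Any


def normalize_record(event: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical function body: clean one incoming record and return it.

    The event layout ({"record": {...}}) is an assumption for illustration.
    """
    record = event["record"]
    return {
        # Coerce the identifier to a string and drop surrounding whitespace.
        "id": str(record["id"]).strip(),
        # Email comparison is easier once case and whitespace are normalized.
        "email": record["email"].strip().lower(),
        # Parse the amount and round it to two decimal places.
        "amount_usd": round(float(record["amount"]), 2),
    }
```

Calling `normalize_record({"record": {"id": " 42 ", "email": " Ada@Example.COM ", "amount": "19.999"}})` returns a record with a trimmed id, a lowercased email, and a rounded amount. Because this is plain Python, the same function can be tested locally before it is deployed.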

Simplify Infrastructure and Cluster Management

Serverless Python functions in VAST DataEngine run on demand, eliminating the need to provision or manage Kubernetes or separate compute infrastructure. Fully managed by the platform, they reduce operational complexity so developers can focus on building AI and data workflows instead of maintaining systems. Functions are event-driven and automatically triggered by data changes, streams, or schedules, enabling real-time execution without manual intervention. This simplifies development and removes the overhead of managing distributed environments.

Execute Event-Driven, Auto-Scaling Workloads at Scale

Functions are triggered by events such as new data arrival, event streams, API calls, or scheduled jobs, enabling real-time and asynchronous processing. The platform automatically scales execution based on workload demand, supporting high levels of parallelism without manual configuration. Each execution is stateless, with state managed in the underlying data platform, ensuring consistent and reliable behavior. This allows functions to handle both bursty and continuous workloads efficiently while maintaining predictable performance.
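The stateless execution model described above can be sketched as follows. The function keeps nothing between invocations and reads and writes all state through a store passed in by the caller. `StateStore`, `InMemoryStore`, and `count_arrivals` are hypothetical names for this example; the in-memory store only stands in for platform-managed state during local testing.

```python
from typing import Protocol


class StateStore(Protocol):
    """Hypothetical interface to state held outside the function."""

    def get(self, key: str, default: int) -> int: ...
    def put(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Local stand-in for platform-managed state (assumption, testing only)."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def get(self, key: str, default: int) -> int:
        return self._data.get(key, default)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def count_arrivals(event: dict, store: StateStore) -> int:
    """Stateless handler: every invocation loads and saves its state externally."""
    key = f"arrivals:{event['bucket']}"
    total = store.get(key, 0) + 1
    store.put(key, total)
    return total
```

Since the handler holds no state of its own, any number of copies can run in parallel. Note that the read-then-write shown here is not atomic; under real concurrency the backing store would need an atomic increment or a transaction.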

Integrated Data and Compute Platform

Build, Orchestrate, and Monitor Data Pipelines

Serverless functions can be composed into pipelines that orchestrate end-to-end data workflows, from ingestion and transformation to enrichment and AI processing. Pipelines can be built programmatically or visually using the VAST DataEngine pipeline builder, where users define workflows through triggers, functions, and execution steps. Functions can trigger other functions, respond to events, and coordinate multi-step execution without external orchestration tools. Users have access to comprehensive logs and traces for full pipeline monitoring and observability. This allows teams to define and manage complete data pipelines directly within the platform, using either code or a low-code interface.
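To make the idea of composing functions into a pipeline concrete, here is a small sketch in plain Python. It is not the DataEngine pipeline builder or its API; `run_pipeline` and the three step functions are hypothetical and only show the pattern of chaining steps while recording a simple trace.

```python
from typing import Any, Callable

# A step takes an event payload and returns a new payload.
Step = Callable[[dict[str, Any]], dict[str, Any]]


def run_pipeline(
    event: dict[str, Any], steps: list[Step]
) -> tuple[dict[str, Any], list[str]]:
    """Run each step in order and record which steps executed."""
    trace: list[str] = []
    payload = event
    for step in steps:
        payload = step(payload)
        trace.append(step.__name__)
    return payload, trace


def ingest(event: dict[str, Any]) -> dict[str, Any]:
    # Strip surrounding whitespace from the raw input.
    return {**event, "text": event["raw"].strip()}


def transform(event: dict[str, Any]) -> dict[str, Any]:
    # Lowercase the text and split it into tokens.
    return {**event, "tokens": event["text"].lower().split()}


def enrich(event: dict[str, Any]) -> dict[str, Any]:
    # Attach a derived attribute to the payload.
    return {**event, "token_count": len(event["tokens"])}
```

Running `run_pipeline({"raw": "  Hello VAST World "}, [ingest, transform, enrich])` returns the enriched payload together with the ordered list of step names, a toy version of the logs and traces a managed pipeline would expose.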

Powered by VAST Native Compute

Python function execution is powered by internal Kubernetes clusters running on VAST Native Compute directly on VAST CNodes (compute nodes). Functions are executed on the VAST DataEngine, which is natively integrated with VAST Native Compute. This enables serverless function orchestration, including scheduling, fault tolerance, and scalability. By embedding orchestration within the platform, VAST delivers a fully managed compute experience as part of the data system.

Automatically Balance Resources Between Data and Compute

The VAST DataEngine dynamically allocates resources between data services and compute workloads based on demand and policy. Compute capacity scales up to support function execution and returns to data processing as workloads shift, maximizing overall system utilization. This unified resource model removes the need to provision separate infrastructure for compute and storage while maintaining consistent performance across both. By managing compute and data within a single system, VAST simplifies operations and improves efficiency at scale.

Secure Multi-Tenancy and Hybrid Compute

Secure Multi-Tenancy

Every tenant in VAST is fully isolated and can be provisioned with logical isolation through dedicated namespaces or physical isolation with a dedicated cluster, all managed from a single control plane. Network policies enforce strict separation by preventing cross-tenant traffic at the infrastructure level, ensuring workloads remain secure even in densely shared environments. Administrators gain extensive observability into every layer of the system, with per-tenant, per-namespace, and per-function or pipeline visibility into activity, performance, and resource usage. This combination of flexible isolation models and deep operational insight enables secure multi-tenancy at scale, whether teams need shared efficiency or strict separation.

Unify Internal and External Compute in One Experience

Tenants can be provisioned with one or more internal compute clusters in VAST and can also link external compute clusters, all accessed through the same control plane and developer experience. Pipelines running on internal and external compute can trigger one another, giving users the flexibility to run each workload where it makes the most sense based on cost, performance, data locality, or specialized hardware requirements. This hybrid execution model lets teams mix compute environments without fragmenting their tooling, workflows, or governance. VAST unifies internal and external compute into a single, consistent experience for both developers and operators.
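The placement decision described above (cost, data locality, specialized hardware) can be illustrated with a toy selection routine. This is a sketch under stated assumptions: the `Cluster` fields and the `choose_cluster` policy are invented for the example and do not describe how VAST schedules workloads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical description of a compute cluster linked to a tenant."""

    name: str
    internal: bool  # True if the cluster runs inside the data platform
    has_gpu: bool
    cost_per_hour: float


def choose_cluster(
    clusters: list[Cluster], needs_gpu: bool, prefer_data_locality: bool
) -> Cluster:
    """Pick a cluster that meets hardware needs, then rank by locality and cost."""
    candidates = [c for c in clusters if c.has_gpu or not needs_gpu]
    if not candidates:
        raise ValueError("no cluster satisfies the hardware requirement")
    # Tuples sort element by element: when locality is preferred, internal
    # clusters (False) rank ahead of external ones (True); cost breaks ties.
    return min(
        candidates,
        key=lambda c: (prefer_data_locality and not c.internal, c.cost_per_hour),
    )
```

With one internal CPU cluster, one external GPU cluster, and one cheaper external CPU cluster, a GPU workload lands on the external GPU cluster, a locality-sensitive CPU workload stays internal, and a cost-driven CPU workload goes to the cheapest option.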