May 29, 2025

Mixing CNodes and EBoxes to Power the Future of AI Infrastructure


Author

Howard Marks, Technologist Extraordinary and Plenipotentiary

When we first introduced DASE as the architecture behind what we then called Universal Storage, we talked about asymmetric scaling through the lens of a storage system; DBoxes provided capacity, and CNodes provided performance. HPC centers typically ran more CNodes per DBox, or per petabyte, than customers using VAST systems to hold and protect their backup data, but that CNode/DBox ratio varied across a rather narrow range.

As we adapted DASE to use standard servers as EBoxes, we dedicated our efforts to bringing the performance and reliability that VAST had built its reputation on to this new architecture. Consequently, we limited support to clusters made up exclusively of EBoxes. At the same time, we expanded the functionality of the VAST AI Operating System to provide a much broader and, in many cases, more compute-intensive set of data services.

The VAST cluster at the center of an AI training or inferencing workload will run a Kafka API-compatible event broker, a SQL query engine, a database manager, and a serverless function execution engine, each of which would typically require a separate cluster of servers or containers without VAST. All those services need more computing power than the CPUs built into a cluster of EBoxes can provide.

Vendors of typical shared-nothing systems would answer by having the customer build the cluster from nodes with fewer SSDs, or from a new node model with more CPU cores and DRAM, so the cluster has enough computing power to run all the services the customer wants. The problem with this solution is that all the nodes in a shared-nothing storage pool have to be identical. Therefore, when a customer with an existing cluster decides they want to run the SQL query engine or event broker, they have to replace the whole cluster to get the additional computing power those new services require. This should all sound familiar to shared-nothing storage customers who’ve had to upgrade their nodes when their vendors added CPU-intensive features like encryption at rest or data compression.

While EBoxes package CNodes and DNodes into the same physical box, VAST clusters aren’t your father’s shared-nothing architecture. Shared-nothing architectures place each node in exclusive control of its SSDs, requiring complex communications between nodes to coordinate reads and writes. The DNode containers in each EBox connect the SSDs in that EBox to the cluster’s NVMe fabric the same way DNodes in DBoxes do. Furthermore, the CNodes in the cluster all mount all of those SSDs at boot time so every CNode in the cluster can directly access every SSD in the cluster using NVMe over Fabrics.


An EBox runs CNode and DNode containers

Just as no CNode owns any of the SSDs in a VAST cluster with disaggregated CNodes and DBoxes, the CNode container in any given EBox does not own the SSDs in that particular EBox. Regardless of their physical location, all the SSDs are equally accessible by all the CNode containers in the cluster.
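To make the ownership difference concrete, here is a minimal Python sketch of the two access models. It is a toy model, not VAST code: the `SSD` and `CNode` classes, the function names, and the box and device names are all hypothetical, invented only to illustrate the point that SSD reachability in DASE does not depend on where a CNode runs.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class SSD:
    ssd_id: str
    box: str  # enclosure (EBox or DBox) that physically holds this SSD


@dataclass(frozen=True)
class CNode:
    name: str
    box: Optional[str] = None  # hosting EBox, or None for a standalone CNode


def shared_nothing_reachable(node: CNode, ssds: List[SSD]) -> Set[str]:
    """Shared-nothing model: a node directly reaches only SSDs in its own box."""
    return {s.ssd_id for s in ssds if s.box == node.box}


def dase_reachable(node: CNode, ssds: List[SSD]) -> Set[str]:
    """DASE model: every CNode reaches every SSD, whatever box either sits in."""
    return {s.ssd_id for s in ssds}


# Hypothetical three-SSD cluster spread over two EBoxes.
ssds = [SSD("ssd-a1", "ebox-a"), SSD("ssd-a2", "ebox-a"), SSD("ssd-b1", "ebox-b")]
in_ebox = CNode("cnode-a", box="ebox-a")
standalone = CNode("cnode-x")

print(sorted(shared_nothing_reachable(in_ebox, ssds)))  # only the SSDs in ebox-a
print(sorted(dase_reachable(in_ebox, ssds)))            # all three SSDs
print(sorted(dase_reachable(standalone, ssds)))         # all three SSDs
```

In this toy model the CNode container in an EBox and a standalone CNode see exactly the same set of SSDs under DASE, which is what lets either kind of CNode serve any request.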

When a DASE cluster needs more computing power to support the bandwidth demanded by some GPU servers or to run services like event brokers and user functions, the answer is to add more CNodes. This has always been true for clusters running discrete CNodes and DBoxes, and with the latest release of the VAST AI Operating System, it is now true for EBox clusters, too.
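The arithmetic behind this kind of asymmetric scaling can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ClusterShape` class is hypothetical, and the core counts and capacities are placeholder numbers, not specifications of any VAST hardware. The sketch also simplifies by counting all of an EBox's cores as cluster compute, even though an EBox runs both CNode and DNode containers.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClusterShape:
    eboxes: int
    standalone_cnodes: int
    cores_per_ebox: int   # placeholder value, not a real hardware spec
    cores_per_cnode: int  # placeholder value, not a real hardware spec
    tb_per_ebox: int      # placeholder value, not a real hardware spec

    @property
    def total_cores(self) -> int:
        # Compute comes from the EBoxes plus any standalone CNodes.
        return (self.eboxes * self.cores_per_ebox
                + self.standalone_cnodes * self.cores_per_cnode)

    @property
    def capacity_tb(self) -> int:
        # Capacity comes only from the boxes that hold SSDs.
        return self.eboxes * self.tb_per_ebox

    def add_cnodes(self, count: int) -> "ClusterShape":
        """Return a new shape with more standalone CNodes and the same EBoxes."""
        return replace(self, standalone_cnodes=self.standalone_cnodes + count)


base = ClusterShape(eboxes=10, standalone_cnodes=0,
                    cores_per_ebox=32, cores_per_cnode=64, tb_per_ebox=300)
grown = base.add_cnodes(4)

print(base.total_cores, base.capacity_tb)    # 320 3000
print(grown.total_cores, grown.capacity_tb)  # 576 3000
```

With these placeholder figures, adding four CNodes raises the core count from 320 to 576 while capacity stays at 3000 TB and no existing EBox is touched.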

In a mixed CNode and EBox cluster, since every CNode is a peer with access to every SSD in the cluster, every CNode can provide any of the services the cluster offers. Every CNode also performs housekeeping tasks like migrating data from the SCM write buffer to the capacity flash SSDs and rebuilding erasure codes when SSDs or EBoxes go offline.


As you can see from the diagram above, the CNode pools for data services like VAST’s query engine can include both standalone CNodes and the CNodes within the EBoxes in a cluster.
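A mixed pool of this kind can be sketched as follows. This is a hedged illustration only: the `CNode` class, the `pool_composition` helper, and the pool and node names are hypothetical and do not reflect how VAST actually defines or configures CNode pools. The one idea it captures is that pool membership is not restricted by whether a CNode is standalone or runs inside an EBox.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CNode:
    name: str
    ebox: Optional[str] = None  # hosting EBox, or None for a standalone CNode

    @property
    def kind(self) -> str:
        return "ebox" if self.ebox else "standalone"


def pool_composition(pool: List[CNode]) -> Dict[str, int]:
    """Count how many members of a pool are EBox-hosted versus standalone."""
    return dict(Counter(node.kind for node in pool))


# Hypothetical cluster: two CNode containers in EBoxes, two standalone CNodes.
e1 = CNode("cnode-e1", ebox="ebox-1")
e2 = CNode("cnode-e2", ebox="ebox-2")
s1 = CNode("cnode-s1")
s2 = CNode("cnode-s2")

# Hypothetical pools: each service draws on both kinds of CNode.
pools = {
    "query-engine": [e1, s1],
    "event-broker": [e2, s2],
}

for service, members in pools.items():
    print(service, pool_composition(members))
```

Because every CNode is a peer, nothing in the model needs to treat the two kinds differently when a pool is assembled; the `kind` property exists only to report the mix.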

As we add more compute-driven services to the VAST AI Operating System, including analytics, automation, application runtimes, and the ability to manage and execute user functions, the flexibility of DASE’s support for asymmetrical cluster scaling really comes in handy. With DASE, customers are no longer bound to the set of data services their cluster’s nodes can support; they can simply add CNodes to support new services, regardless of whether their VAST cluster uses DBoxes or EBoxes to hold its SSDs.

This approach of mixing CNodes and EBoxes directly addresses a critical challenge for customers: the need to adapt and scale their data infrastructure without costly and disruptive hardware overhauls. Customers care deeply about this because it offers simplicity, flexibility, and cost-efficiency. They are no longer locked into the initial compute capacity of their EBox-based solution.

Instead, they can seamlessly add CNodes to provide the necessary processing power for emerging AI, analytics, and other demanding applications as their needs evolve. This means they can protect their initial investment, avoid the “rip and replace” cycle common with traditional shared-nothing architectures, and confidently embrace new data services, knowing their VAST AI Operating System can scale compute and capacity independently and economically.
