
Jun 9, 2026

Splunk SmartStore Optimized by VAST Data’s Infinite Cache

Authored by

Vaughn Stewart, VP Systems Engineering

Organizations rely on Splunk to collect, retain, and analyze ever-growing volumes of machine data, making scalable, resilient storage a critical part of the architecture. VAST Data has now been validated for Splunk SmartStore deployments, including both single-site and multi-site high-availability clusters, giving customers confidence that VAST’s S3-compatible object storage can support SmartStore’s demanding requirements for performance, scale, and data integrity.

The test suite is straightforward: it validates the interoperability of a vendor’s S3 implementation with SmartStore. The tests focus on I/O performance and data integrity at scale while data ingest and searches are executed across a set of environments in normal, degraded, and recovered operational states.

I’m pleased that VAST was able to earn the validation status, but there’s much more that resulted from this effort.

The Validation Effort Resulted in a New Deployment Model

The validation process revealed something we at VAST hadn’t fully anticipated: a deployment model architecturally distinct from anything else on the market, one that increases observability performance, expands search capabilities, and reduces Splunk’s total storage capacity requirements by two-thirds or more. The results are so compelling that we gave it a name.

I’d like to introduce you to Splunk Infinite Cache by VAST Data.

Executing the Validation “The VAST Way”

First-principles thinking is a founding discipline at VAST. Before we began the validation process, we did something that most storage vendors don’t: we set aside the reference architecture and started from a blank sheet of paper. We studied how SmartStore was designed - its core assumptions, its architectural decisions - and then asked what becomes possible when the underlying storage removes those constraints entirely.

Fundamentally, SmartStore implements a tiered storage architecture for storing and searching buckets (aka data). Each Indexer server has a local cache, containing hot buckets and their replicas, and cached warm buckets. The cached warm buckets are typically comprised of recently indexed (aka rolled) buckets or recently retrieved buckets.

The key is understanding how searches are executed with this storage architecture:

Splunk buckets are comprised primarily of two large files: the timeseries index (*.tsidx) and the journal (*.gz), a compressed bundle of event log files. There’s also metadata, but it’s inconsequential to storage sizing.
All Splunk buckets are searchable; however, search execution is limited to buckets residing in the Indexer server’s local cache.
When remote buckets are required for a search, the search execution must wait while remote buckets are downloaded from S3 to local cache.
All the data in the local cache is replicated – either cache to cache (hot) or cache to S3 (warm).
The capacity of the local cache directly correlates to search performance. Larger caches increase performance by increasing cache hit rates. Storing more data for longer durations reduces the volume and frequency of remote bucket downloads.
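The relationship between cache contents and search wait time can be sketched with a toy model. This is a minimal illustration, not Splunk's cache manager: the `Bucket` class, the `download_wait_seconds` function, the bucket sizes, and the fixed 2 GB/s download rate are all hypothetical, and the model ignores concurrency and per-request latency.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical stand-in for a Splunk bucket (tsidx plus journal)."""
    bucket_id: str
    size_gb: float


def download_wait_seconds(needed, cached_ids, s3_gb_per_s):
    """Return (cache_hit_rate, seconds spent fetching uncached buckets).

    Assumes every cache miss is downloaded at one fixed aggregate rate,
    which is a deliberate simplification.
    """
    if not needed:
        return 1.0, 0.0
    missing = [b for b in needed if b.bucket_id not in cached_ids]
    hit_rate = (len(needed) - len(missing)) / len(needed)
    missing_gb = sum(b.size_gb for b in missing)
    return hit_rate, missing_gb / s3_gb_per_s


# A search that touches 100 buckets of 10 GB each (illustrative sizes).
needed = [Bucket(f"b{i}", 10.0) for i in range(100)]
small_cache = {f"b{i}" for i in range(60)}   # 60 of 100 buckets cached
large_cache = {f"b{i}" for i in range(100)}  # every bucket cached

for label, cache in (("small", small_cache), ("large", large_cache)):
    hit_rate, wait = download_wait_seconds(needed, cache, s3_gb_per_s=2.0)
    print(f"{label} cache: hit rate {hit_rate:.0%}, download wait {wait:.0f} s")
```

In this model the search with the smaller cache stalls while 400 GB is fetched, and the search with the larger cache does not stall at all, which is the effect the points above describe.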

We engaged some of Splunk’s largest customers to pressure-test our thinking. What we heard was unambiguous: customers focus on cache capacity over S3 storage due to the impact on search performance. These customers had deployed everything: local NVMe SSDs and enterprise all-flash SANs for the local cache, and every S3 platform on the market, from enterprise to open source.

Interestingly, all shared a common challenge: when the local cache gets exhausted - and it does - the slow performance of ‘cost-optimized’ S3 platforms brings searches to a standstill.

VAST doesn’t change the I/O behavior of Splunk. We optimize the underlying infrastructure to accelerate it.

Infinite Cache Starts with Data Reduction

Infinite Cache is a deployment model for SmartStore, one that is categorically unique in the industry and made possible only by the capabilities of VAST’s Disaggregated and Shared-Everything (DASE) architecture. No other storage platform can deliver what the VAST DataStore delivers here through the power of DASE.

VAST collapses the need for multiple storage technologies by delivering both low-latency block and high-performance S3 storage within a single data platform. VAST’s similarity-based data reduction and compression shrink the capacity required by every Splunk bucket, and data deduplication dramatically reduces capacity requirements for bucket replicas, whether those are hot bucket replicas in local cache or warm bucket replicas residing in the local cache and S3.

The result of provisioning all Indexer cluster storage from a single VAST platform - and reducing data across both block and object tiers simultaneously - is something no other vendor can claim: the marginal cost of local cache capacity approaches zero.

As an example: For SmartStore deployments with 120 days of local cache capacity and 1 year of data retention, we see 3:1 data reduction for the entire Indexer cluster, not just S3.

That’s 2.5x better data reduction than what we observed with alternative storage options (enterprise storage, local SSDs, or open-source software). And here’s what makes it remarkable: whether a customer deploys a cache holding 120 days or multiple years of Splunk buckets, the storage footprint of the underlying VAST cluster does not change.
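The arithmetic behind these ratios can be worked through in a few lines. This sketch rests on two assumptions that are not in the validation results: the 900 TB logical footprint is a made-up figure, and "2.5x better" is read as a ratio of reduction ratios, so the alternatives land at 3 / 2.5 = 1.2:1. The helper name `physical_tb` is hypothetical.

```python
def physical_tb(logical_tb, reduction_ratio):
    """Physical capacity needed to hold logical_tb at an N:1 reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return logical_tb / reduction_ratio


LOGICAL_TB = 900.0             # hypothetical logical footprint of the Indexer cluster
VAST_RATIO = 3.0               # 3:1, as stated in the text
ALT_RATIO = VAST_RATIO / 2.5   # assumed reading of "2.5x better": 1.2:1

vast_tb = physical_tb(LOGICAL_TB, VAST_RATIO)
alt_tb = physical_tb(LOGICAL_TB, ALT_RATIO)
savings_vs_logical = 1 - vast_tb / LOGICAL_TB

print(f"Physical capacity at 3:1: {vast_tb:.0f} TB")
print(f"Physical capacity at the implied alternative ratio: {alt_tb:.0f} TB")
print(f"Capacity saved versus logical size: {savings_vs_logical:.0%}")
```

Under these assumptions, a 3:1 ratio is the same thing as a two-thirds reduction in required capacity, and the implied ratio for the alternatives is 1.2:1.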

Allow me to quantify the economic impact: we replaced 18 storage platforms – 9 for block and 9 for S3 – with a single VAST cluster that fit in 18 rack units.

Cache Scale + Parallel I/O = Unprecedented Performance

Indexer clusters range in size from tens to thousands of servers. At that scale, the aggregated I/O demands are extraordinary. Reducing remote bucket downloads matters, but eliminating cache exhaustion is only half the equation. The other half is whether the underlying storage architecture can support the performance requirements of searches.

While there are other storage arrays that could reduce data across block and object, they lack the I/O performance to support both the block and S3 storage functions.

This is where DASE is unrivaled. The architecture delivers parallel I/O that can scale linearly whether your throughput requirement is GBs or tens of TBs per second. Each CNode container independently addresses every SSD in the cluster as if it were local. There is no East/West traffic, no I/O bottlenecks due to controller limits, and no diminishing returns on performance as the cluster scales - as one finds with shared-nothing architectures.

Massive, near-zero-cost local cache. Embarrassingly parallel I/O. The result: every class of Splunk search - near-term, ad-hoc, and deep historical - runs faster, at scale, without compromise.

Validated at Scale: The Numbers

A single VAST cluster (10 CNodes and 4 DBoxes) replaced the storage infrastructure for an Indexer cluster that previously required 9 enterprise all-flash arrays for local cache and 9 open-source S3 platforms. What follows is what we observed.

The telemetry tells the story. Steady-state block I/O holds at approximately 10 GB/s of reads (red) and writes (green), representing continuous log ingestion and bucket indexing. During search execution, that figure spikes to approximately 60 GB/s.

And the yellow bars? Near zero. Almost no S3 downloads from the remote object store because Infinite Cache means buckets are already where Splunk expects them. Where S3 transfers do occur, the transfer rate is exceptional, a reflection of VAST’s object performance - not a limitation of it.

This data validates the value of Infinite Cache, and it also reveals headroom. Note that the environment is currently network bound. The VAST cluster can support ~3X additional bandwidth before it would need to be expanded.

Infinite Cache Simplifies Sizing and Lifecycle Management

With Infinite Cache, the cache is no longer a capacity constraint to be engineered around; now, it’s an elastic resource that grows with demand. Deployments on VAST scale on demand, as search requirements dictate, whether Indexers run on bare metal, virtual machines, or containers. Customers stop sizing for worst-case and start planning for what they actually need.

VAST’s Synchronous Replication Eliminates Incomplete Search Results

Multi-site HA SmartStore clusters rely on the S3 object stores to bidirectionally replicate data between the S3 platforms at each site. Due to the replication lag inherent in ‘eventually consistent’ S3 replication, Splunk searches can encounter incomplete search errors.

VAST clusters implement strictly consistent, bidirectional S3 bucket replication. Each cluster maintains a complete and current set of Splunk buckets, which means Splunk searches are free of incomplete search errors. Always.

Uploading a bucket to a VAST cluster does incur increased latency compared to an eventually consistent S3 platform because the upload must be acknowledged by both the local and remote VAST cluster before completing. I want to address that directly: this increased upload latency has no impact on Splunk indexing or search performance. The tradeoff is strict consistency for zero incomplete search errors. That’s not a limitation. That’s the point.
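The difference between the two acknowledgment models can be shown with a toy simulation. This is a sketch under stated assumptions, not VAST's replication protocol or Splunk's upload path: the `Site` and `EventualReplicator` classes, the `upload_strict` function, and the object key are all hypothetical, and replication lag is modeled as an explicit `drain()` call.

```python
class Site:
    """Toy object store for one site: maps object key to bytes."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def has(self, key):
        return key in self.objects


def upload_strict(key, data, local, remote):
    """Write to both sites before acknowledging the upload."""
    local.objects[key] = data
    remote.objects[key] = data  # the uploader waits for this write as well
    return "ack"


class EventualReplicator:
    """Acknowledge after the local write; copy to the remote site later."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = []

    def upload(self, key, data):
        self.local.objects[key] = data
        self.pending.append((key, data))
        return "ack"  # the remote site does not hold the object yet

    def drain(self):
        """Simulate the replication lag elapsing."""
        for key, data in self.pending:
            self.remote.objects[key] = data
        self.pending.clear()


key = "hypothetical_index/bucket_0001"

# Strictly consistent: the remote site can see the bucket as soon as the ack returns.
site_a, site_b = Site("site-a"), Site("site-b")
upload_strict(key, b"journal+tsidx", site_a, site_b)
strict_visible = site_b.has(key)

# Eventually consistent: a window exists where the remote site lacks the bucket.
site_c, site_d = Site("site-c"), Site("site-d")
replicator = EventualReplicator(site_c, site_d)
replicator.upload(key, b"journal+tsidx")
eventual_visible_before = site_d.has(key)
replicator.drain()
eventual_visible_after = site_d.has(key)

print(strict_visible, eventual_visible_before, eventual_visible_after)
# prints: True False True
```

The window between the acknowledgment and `drain()` is where a search at the remote site would find the bucket missing. The strict model closes that window at the cost of waiting for the second write.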

Note that because multi-site HA SmartStore clusters rely on S3 to replicate data, replicated buckets are not present in the local cache of the remote Indexers. This is an area where the parallel I/O of VAST dramatically reduces the impact of downloading remote buckets.

This Is Only the Beginning

The timing of this validation matters. Organizations are scaling observability to support agentic AI rollouts - and doing so against the backdrop of a global hardware supply chain shortage. Every platform decision right now carries consequences measured in months, not weeks. If you’re a Splunk Enterprise customer, the question isn’t whether to consider VAST - it’s how quickly you can get there.

SmartStore validation is the foundation, not the ceiling. We are actively working to enable Splunk search head clusters to directly query the VAST DataBase, bringing structured, high-performance analytics to bear on the same data Splunk already indexes. When that work is complete, the architecture we’re describing today will look modest by comparison. We genuinely can’t wait to share it.

To the Splunk partner and engineering teams: thank you. You were exceptional partners throughout a validation process that kept expanding in scope - because the architecture kept surprising us with what was possible. To the customers who deployed SmartStore on VAST before the validation was complete: your confidence in us meant everything. And to the channel, GSI partners, and the sales teams at Cisco and VAST: go show customers how we can help them scale.

Ready to eliminate the tradeoffs between search performance, data retention, and cost?

Join our webinar on June 11 and 12 to see how a modern Splunk architecture can unify your data on a single stack while delivering faster insights, simpler operations, and lower TCO.