It all started with a seemingly simple question…
‘What else would you like from flash storage that you’re not getting today?’
When we started the company, we had the good fortune to talk with, interview and collaborate with hundreds of customers – customers of all shapes and sizes, from classic enterprises to hyperscale cloud providers to some of the world’s leading computational research and AI centers. Our background is in flash storage and scalable systems, so ‘what else would you like from flash storage’ seemed like an easy enough question to ask and get answers to. As easy as it sounds, the answer we most commonly heard was not at all what we expected. Unknowingly, we had always viewed that simple question through the lens of performance, because performance is what customers had always associated flash storage with… Customers surprised us by explaining that they didn’t really need any more performance than what they were getting from the generation of flash storage systems available in 2015. Many customers told us that when they bought flash, they were often buying more performance than their applications needed, because performance is what you get when you buy flash capacity. Tier-1 flash was already ‘fast enough’.
What we did learn is that customers love flash for reasons well beyond performance.
- Flash doesn’t hotspot in the same ways that mechanical media does.
- Flash is not crippled by access concurrency or random vs. sequential workloads in the way HDDs are.
- Flash can be used to apply much more fine-grained data reduction than HDDs, since you no longer have to worry about the idiosyncrasies of disk physics.
- Flash doesn’t have any moving parts, so drives fail considerably less often. Investments in flash can be amortized over a decade.
- Flash is now better for the data center – drive sizes are already at parity with disks, and the advanced data reduction that’s possible with flash tips the scales away from HDDs.
So – if flash is so great, why aren’t all data centers built entirely from flash storage? The problem, up until now, has always been cost. Even with the benefit of more aggressive data reduction, the delta between the cost of flash systems and HDD-based scale-out storage and archives has been too severe to ignore. Today’s enterprise flash storage can cost 10x-20x what people pay for HDD archives. That is why, despite all of today’s storage vendor hyperbole and hand-waving, nearly 80% of today’s storage shipments are still HDD-based systems, and why customers continue to build complex tiers of infrastructure to balance the performance and capacity tradeoffs forced upon them by legacy flash and HDD system architectures.
This pyramid of infrastructure became the focal point of our early conversations – and what we realized through hundreds of discussions is that customers didn’t want to tier their data and manage the operational complexity of the five, ten, even twenty different types of storage they had to move their data between. Storage tiering is just something they’d been conditioned to do for lack of a better alternative.
Storage may not have been changing, but the applications sure have.
When we turned our attention to what was driving the next wave of industrialization, we started to consider the challenges emerging with Artificial Intelligence (AI). AI has quickly graduated from the research lab and is actively being used for everyday practical applications – and while we’re still in the early days of AI, we see evidence that modern tools such as TensorFlow, PyTorch and scikit-learn are being adopted not only by hyperscale and startup companies, but also by very established organizations.
The challenge with the trend toward big data and machine and deep learning is that the I/O model is diametrically opposed to how organizations have been conditioned to store and manage their data for the past few decades. These tools only become more effective and more accurate as they are exposed to larger and larger data sets. If, on the other hand, an organization’s corpus of data has been exiled to slow archival storage, it will never be accessible for these modern applications to derive business value from.
The result: the fourth wave of industrialization (AI) has rendered the classical storage tiering pyramid obsolete.
And then, VAST Data…
VAST Data was founded on one idea: customers should no longer be forced to choose between a series of compromises in architecture and application outcomes. Flash, at its fundamental level, has broken the long-standing tradeoff between performance and capacity – everything scales linearly with flash. Furthermore, new technologies such as persistent memory and data-center-scale low-latency storage fabrics can help flash express this linearity while also driving down costs when applied through a new type of storage architecture.
The journey to this concept required a great deal of contrarian thinking and for us to break decades-old conventions. Fortunately, we had the benefit of a new foundation to build upon – new technologies that enabled VAST’s architects to simplify the problem set in order to exponentially expand the gains. The results are very counter-intuitive:
- Flash is the only media that can be used to bring the cost of storage under what people pay today for HDD-based systems.
- NFS and S3 can be used for applications that up until now required a level of performance that could only come from block storage.
- Low-endurance QLC flash – which stores four bits per cell, giving it lower cost/bit but lower endurance than MLC (2 bits/cell) or TLC (3 bits/cell) – can be used for even the most transactional of workloads.
- Storage computing can be disaggregated from storage media to enable greater simplicity than shared-nothing and hyper-converged architectures.
- Data protection codes can reduce overhead to only 2% while enabling levels of resiliency ten orders of magnitude beyond classic RAID.
- Compressed files provide evidence that data can be reduced further when viewed on a global scale.
- Parallel storage architectures can be built without any amount of code parallelism.
- Customers can build shared storage architectures that can compose and assign dedicated performance and security isolation to tenants on the fly.
- One well-engineered, scalable storage system can be ‘universal’ and can enable a diverse array of workloads and requirements.
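The overhead claim in the list above comes down to simple stripe arithmetic: with N data strips protected by K parity strips, the parity overhead is K/N, so very wide stripes drive overhead toward zero. A minimal sketch of that arithmetic, using hypothetical stripe geometries (not VAST’s published layout):

```python
# Wide-stripe erasure-code overhead arithmetic.
# The stripe geometries below are illustrative examples only.

def protection_overhead(data_strips: int, parity_strips: int) -> float:
    """Fraction of usable capacity consumed by parity."""
    return parity_strips / data_strips

# Classic RAID-6 geometry: 8 data + 2 parity strips.
raid6 = protection_overhead(8, 2)      # 0.25 -> 25% overhead

# A very wide stripe: 200 data + 4 parity strips.
wide = protection_overhead(200, 4)     # 0.02 -> 2% overhead

print(f"RAID-6 (8+2):      {raid6:.0%}")
print(f"Wide stripe (200+4): {wide:.0%}")
```

The wide stripe pays less than a tenth of the capacity tax while tolerating twice as many simultaneous strip failures, which is why wide erasure codes over many flash devices can undercut classic RAID on both overhead and resiliency.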
By focusing not on ultimate performance, but on the cost of infrastructure and the simplicity benefits that are a product of consolidation, we’ve democratized flash for every data center, every application, every user. When we understood that absolute performance is not the ultimate target, we realized that the aggregate flash performance across petabytes to exabytes of resilient, affordable flash capacity will truly enable the modern computing agenda. IOPS and bandwidth are now a byproduct of flash capacity, and everything just becomes ‘VAST Enough’.
Our aim is simple: VAST intends to be an extinction-level event for mechanical media in the enterprise data center and to bring an end to the storage tiering era. The consequence of this vision is not just a simpler life for IT operations, but the elevation of vast reserves of data onto a platform where they can be analyzed, mined and industrialized in real time.
A great team off to a record-breaking start…
VAST engineers force themselves to shed convention on a daily basis in order to deliver on the promise of the Universal Storage architecture: a single storage system that is fast enough for primary storage, scalable enough for huge datasets, and affordable enough to hold the full range of a customer’s data – eliminating the tyranny of tiers. The team is a collection of exemplary individuals who embody the inventor’s spirit.
We’ve also been fortunate to work with leading investors who share our vision for a simpler data center and unbounded insights. Today we announced not only what our company is doing, but also that we’ve secured $80M of funding backed by Norwest Venture Partners, TPG Growth, Goldman Sachs, Dell Technologies and 83 North. The support and guidance we receive from these top-tier institutions has been instrumental as we reshape the world’s perceptions on how modern infrastructures and applications can be built.
The idea is compelling, the team is in place, and the company is off to a very strong start. Since shipping our first GA product in November of 2018, VAST has sold more in its first 90 days than any other IT infrastructure company. In many cases, we’ve sold more in our first quarter than the best infrastructure companies sold in their first year of operation. The fact that customers across different markets are writing multi-million-dollar checks for their first order validates the platform’s resilience and the early architectural choices that have produced a delightful customer experience and rapid customer uptake.
That said – we still have very much to do and we can’t wait to share our story as it evolves. Stay tuned for a blog series we’ll be publishing called “Breaking Tradeoffs” where we’ll review our approach to specific challenges that the market has faced for decades.
Please reach out if you have questions, comments or just want to learn more. We look forward to hearing your application and infrastructure stories and discovering with you how a true all-flash data center can change the way your organization interacts with and extracts value from data.