- Mike Matchett, Small World Big Data
Everyone is now on board with flash. All the key storage vendors have at least announced entry into the all-flash storage array market, with most having offered hybrids -- solid-state drive-pumped traditional arrays -- for years. As silicon storage gets cheaper and denser, it seems inevitable that data centers will migrate from spinning disks to "faster, better and cheaper" options, with non-volatile memory poised to be the long-term winner.
But today's storage skirmish is shifting toward total cost of ownership (TCO), where two key questions must be answered:
- How much performance is needed, and how many workloads in the data center have data with varying quality of service (QoS) requirements or data that ages out?
- Are hybrid arrays a better choice to handle mixed workloads through advanced QoS and auto-tiering features?
All-flash proponents argue that cost and capacity will continue to drop for flash compared to hard disk drives (HDDs), and that no workload is left wanting with the ability of all-flash to service all I/Os at top performance. Yet we see a new category of hybrids on the market that are designed for flash-level performance and then fold in multiple tiers of colder storage. The argument there is that data isn't all the same and its value changes over its lifetime. Why store older, un-accessed data on a top tier when there are cheaper, capacity-oriented tiers available?
There's hybrid and then there's hybrid
It's misleading to lump together hybrids that are traditional arrays with solid-state drives (SSDs) added and the new hybrids that might be one step evolved past all-flash arrays. And it can get even more confusing when the old arrays get stuffed with nothing but flash and are positioned as all-flash products. To differentiate, some industry wags like to use the term "flash-first" to describe newer-generation products purpose-built for flash speeds. That still could cause some confusion when considering both hybrids and all-flash designs. It may be more accurate to call the flash-first hybrids "flash-converged." By being flash-converged, you can expect to buy one of these new hybrids with nothing but flash inside and get all-flash performance.
We aren't totally convinced that the future data center will have just a two-tier system with flash on top backed by tape (or a remote cold cloud), but a "hot-cold storage" future is entirely possible as intermediate tiers of storage get, well, disintermediated. We've all predicted the demise of 15K HDDs for a while; can all the other HDDs be far behind as QoS controls get more sophisticated at automatically mixing hot and cold to create storage of any temperature you might need?
Whither traditional storage?
This brings us to the issue of what traditional storage really is these days. Everyone compares their shiny new products to so-called traditional storage, yet we think that what once was traditional storage has changed significantly and is about to change even more. One of the biggest changes isn't necessarily due to the ability to drop in SSDs in place of HDDs, but rather being able to use the growing bounty of computing power available.
CPU chip capabilities continue to advance as fast as flash. More built-in processing power (more cores supporting more threads, faster execution pipelines) and upcoming features like chip-level encryption support mean more inline and online storage features can be delivered in software. For example, it's now possible for most storage vendors to provide inline deduplication based on software processing.
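To make the idea of software-only inline deduplication concrete, here is a minimal sketch in Python. It assumes fixed-size chunks and SHA-256 fingerprints; the `DedupStore` class and its method names are hypothetical, invented for illustration, and do not describe any vendor's implementation.

```python
import hashlib


class DedupStore:
    """Toy inline-deduplicating store (illustrative assumption, not a product design)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> chunk bytes, each unique chunk kept once
        self.files = {}   # object name -> ordered list of fingerprints

    def write(self, name, data):
        """Split data into fixed-size chunks and store only chunks not seen before."""
        fingerprints = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Deduplication happens inline, on the write path.
            self.chunks.setdefault(fingerprint, chunk)
            fingerprints.append(fingerprint)
        self.files[name] = fingerprints

    def read(self, name):
        """Reassemble an object from its chunk fingerprints."""
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def logical_bytes(self):
        """Bytes the clients believe they have written."""
        return sum(len(self.chunks[fp]) for fps in self.files.values() for fp in fps)

    def physical_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Comparing `logical_bytes()` with `physical_bytes()` after a few writes gives the capacity saving. Every step here is ordinary CPU work (hashing and table lookups), which is why spare processor cycles matter for this kind of feature.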
This has led to the rise of software-defined storage (SDS), which is really an acknowledgement that all of what a storage array does can be run as a program and, in turn, be dynamically programmable. Some vendors still productively leverage custom ASICs (HP 3PAR, SimpliVity), but many array controllers have long been mostly software. And while SDS vendors sell just the software part and leave the infrastructure up to users, many SDS purchasers still end up buying a pre-loaded SDS appliance that doesn't look much different from "traditional" storage when it comes off the pallet.
Still, ambitious new SDS providers have brought some benefits to the storage market. We see improvements offered in QoS at fine-grained levels, dynamic online configuration and partitioning, inline storage features and broader capacity efficiencies. SDS can enable faster refresh cycles to accommodate new technologies and provide increasingly intelligent storage-side analytics.
Tiering isn't a bad word
This brings us back to the key hybrid feature of auto-tiering. Tiering is evolving quickly. It used to rely on relatively simple data aging or recent-access algorithms working with large chunks of data. Now it is increasingly based on fine-grained, small-chunk analyses of access and user/usage patterns over varying time intervals, the stated or required QoS of the data, competing workloads and the increasingly dynamic makeup of available storage resources.
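As a rough illustration of what a tiering decision might weigh, the sketch below places a chunk using a QoS pin, recent access frequency and age. The tier names, the `Chunk` fields and the threshold values are all hypothetical assumptions chosen for readability; real arrays use far richer policies than this.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    accesses_last_hour: int         # recent access frequency
    hours_since_last_access: float  # simple data-aging signal
    pinned_to_flash: bool = False   # stated QoS requirement


def choose_tier(chunk, hot_accesses=100, cold_after_hours=72.0):
    """Return a tier name for one chunk; thresholds are illustrative assumptions."""
    if chunk.pinned_to_flash:
        return "flash"     # QoS pin overrides the access pattern
    if chunk.accesses_last_hour >= hot_accesses:
        return "flash"     # hot data earns the top tier
    if chunk.hours_since_last_access >= cold_after_hours:
        return "archive"   # aged-out data sinks to the cheapest tier
    return "capacity"      # everything else sits in the middle


# Usage: a pinned but idle chunk stays in flash, an unpinned idle one does not.
pinned = Chunk("db-log", accesses_last_hour=0, hours_since_last_access=200.0, pinned_to_flash=True)
idle = Chunk("old-report", accesses_last_hour=0, hours_since_last_access=200.0)
placements = {c.chunk_id: choose_tier(c) for c in (pinned, idle)}
```

The ordering of the checks is the design choice worth noticing: the QoS pin is evaluated first, so a top-end workload keeps its flash placement even when its data looks cold.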
All-flash proponents argue that flash is becoming cost-efficient enough per unit of capacity to handle more mixed workloads with differing requirements. At the same time, flash-converged hybrids are getting better at delivering targeted QoS, including pinning top-end workloads in flash. The all-flash camp counters that any effort put into determining QoS is wasted Opex when every workload can get consistent flash performance. Still, a large percentage of data quickly moves down the value chain, and much of it is rarely or never accessed after a short active lifetime.
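The capacity-cost side of this argument reduces to simple arithmetic, sketched below. The function name is hypothetical and the prices in the usage lines are placeholder units, not market figures; the sketch also ignores the Opex of managing tiers, which is exactly the cost the all-flash camp points to.

```python
def blended_cost_per_gb(hot_fraction, flash_cost_per_gb, capacity_cost_per_gb):
    """Average media cost per GB when only the hot fraction of data sits on flash."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return (hot_fraction * flash_cost_per_gb
            + (1.0 - hot_fraction) * capacity_cost_per_gb)


# Placeholder prices (assumptions, in arbitrary units per GB).
all_flash = blended_cost_per_gb(1.0, flash_cost_per_gb=4.0, capacity_cost_per_gb=1.0)
hybrid = blended_cost_per_gb(0.25, flash_cost_per_gb=4.0, capacity_cost_per_gb=1.0)
```

With these placeholder numbers the hybrid works out to 1.75 per GB against 4.0 for all-flash. The gap narrows as the flash price approaches the capacity-tier price or as the hot fraction grows, which is the trend all-flash proponents are counting on.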
We're definitely approaching a watershed moment in storage. Big vendors like EMC, Hewlett-Packard, IBM and NetApp have hedged their bets with traditional hybrid, all-flash and flash-converged hybrid options, while smaller players like Kaminario, Nimble Storage, Pure Storage and Violin Memory each promote a specific vision. Either way, we suspect the future "traditional" storage array will provide for a wide range of workloads without much manual storage administration, and the lowest TCO option will eventually dominate.
Mike Matchett is a senior analyst and consultant at Taneja Group.