

All-flash array comparison: The keys to performance, capacity success

Analyst George Crump provides tips on selecting an all-flash storage array that meets your environment's performance and capacity needs.

One of the challenges all-flash storage array manufacturers face is differentiating their products from those of their competitors. After all, the storage media is essentially the same. Most all-flash arrays use multi-level cell flash.

As a result, all-flash array (AFA) vendors often end up fighting over what are sometimes minor or irrelevant performance differences in their product offerings. This tip offers key things to look for in an all-flash array comparison.

Core features: Thin provisioning, snapshots, cloning

Most all-flash array vendors now provide the basic core features that data centers have come to expect from their storage systems. These features include thin provisioning, snapshots and clones (writable snapshots). On hard drive-based systems, these features had to be applied judiciously as they could impact application performance. For example, thick provisioning of virtual machines used to be considered a best practice for performance-sensitive situations.


Flash in general, and all-flash storage arrays in particular, changed all that. These features all place extra write loads on the storage system as data is created or updated. Flash responds more quickly to write I/O than hard disk drives, so these features have less of an impact on performance. Combine this with the fact that AFAs provide far more performance than the typical data center needs, and you end up with a feature set that can be deployed with almost no concern about performance impact.

In general, these core features are now commonplace in AFAs and their use is encouraged. There is nothing in the features themselves that should cause concern. However, IT planners should spend time understanding how each all-flash array vendor delivers its features.

Integrated features vs. add-on features

All-flash storage array vendors have used two methods to deliver core features to their customers. Some vendors, such as Pure Storage Inc., SolidFire Inc., EMC XtremIO and even Dell Compellent, have essentially written their storage software from scratch and integrated these features into their offerings. In each case, the vendor installs its software on off-the-shelf hardware. These vendors tend to see themselves more as software vendors than as hardware vendors.

Vendors like Violin Memory Inc., IBM (Texas Memory) and Tegile Systems Inc. leverage an external source for the delivery of some or all of their data services. Both IBM and Violin use an external hardware appliance, and Tegile leverages a ZFS foundation as its source and then adds features like data deduplication, compression and metadata management on top of that.

The integrated solutions should provide a more seamless look and feel to the system, and should do so at a lower cost. On the flip side, IBM and Violin arrays can easily be set up as raw flash arrays with no additional features. For environments that need extreme performance (500,000-plus IOPS), this can be a very attractive option.

Comparing scale-out and scale-up all-flash arrays

AFA vendors are starkly divided on the value of scaling all-flash arrays. With a scale-up array like those from Tegile, Violin, IBM and Pure Storage, you are essentially buying all the performance potential of the array upfront. The only component that can typically be added is more drive shelves.

The concern with a scale-up system is that the data center will reach the limits of capacity, performance or both and then be required to buy a whole new system. However, even midrange scale-up all-flash arrays provide hundreds of thousands of IOPS. Also, systems typically include deduplication and/or compression, so assuming some measure of storage efficiency gain, they can scale to dozens if not hundreds of terabytes. It is also important to note that an increasing number of scale-up all-flash arrays allow users to replace old controller heads with new ones so that performance and capacity can be scaled in tandem. Regardless, when choosing a scale-up all-flash storage array, it is important to carefully consider whether the system's capacity and performance potential will meet your needs going forward.
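To make the capacity math concrete, here is a minimal sketch of the effective-capacity estimate described above. The 3:1 deduplication and 1.5:1 compression ratios are illustrative assumptions for a mixed workload, not figures from any vendor:

```python
# Rough effective-capacity estimate for a scale-up AFA.
# Ratios are hypothetical; real gains vary widely by data set.

def effective_capacity_tb(raw_tb, dedupe_ratio=3.0, compression_ratio=1.5):
    """Logical capacity after data-efficiency features are applied."""
    return raw_tb * dedupe_ratio * compression_ratio

# A 20 TB raw array with 3:1 dedupe and 1.5:1 compression:
print(effective_capacity_tb(20))  # prints 90.0
```

This is why a modest raw-capacity purchase can still "scale to dozens if not hundreds of terabytes" of logical capacity, and why the assumed efficiency ratio deserves scrutiny during an evaluation.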

Scale-out systems, like those from SolidFire, EMC XtremIO and Kaminario Inc., are built from a cluster of servers that host the storage software and are aggregated to provide performance and capacity. Most of these systems have to start with three nodes, which may be overkill for the typical data center. The good news is that XtremIO and Kaminario both allow for a blended model: they can start as a single-node scale-up design and then, if performance and/or capacity limits are reached, shift to a scale-out mode. SolidFire addresses this with a very small three-node cluster that allows the intermixing of high- and low-density node sizes within the same cluster.

The role of high availability

It stands to reason that an all-flash storage array is more than likely a mission-critical system. Consequently, the availability of the AFA is also a critical design consideration, so most all-flash arrays offer high availability (HA) out of the box. Scale-out systems provide HA through the very nature of their cluster configuration. If one node fails, the others continue serving data while the failed node is rebuilt in the background. The other advantage of this approach is that performance in a failed state only decreases by roughly one divided by the number of nodes, so in a larger cluster, the performance impact should be minimal.

Scale-up systems take one of two approaches: active-active or active-passive. In an active-active approach, when both storage controllers are operational, both are responsible for serving data. In other words, the performance load is distributed between them. The downside to this approach is that, if one controller fails, the entire I/O load goes through the remaining controller and, in theory, performance could degrade by 50%.

In an active-passive approach, the second controller sits essentially idle, waiting for the primary controller to fail. If the primary controller does fail, all workloads shift to the secondary controller. While this means there is idle hardware, it also means performance is very consistent in a failed state.
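The failed-state behavior of the three HA designs discussed above can be sketched with back-of-the-envelope arithmetic. The fractions below are theoretical upper bounds assuming evenly distributed load, not measured results:

```python
# Fraction of normal performance remaining after one controller/node fails,
# for each HA design discussed in the text. Illustrative theory, not benchmarks.

def failed_state_performance(design, nodes=2):
    """Return the fraction of normal performance available after one failure."""
    if design == "active-passive":
        return 1.0                  # standby controller absorbs the full load
    if design == "active-active":
        return 0.5                  # all I/O funnels through one controller
    if design == "scale-out":
        return (nodes - 1) / nodes  # lose roughly 1/N of the cluster
    raise ValueError(f"unknown design: {design}")

for design, nodes in [("active-passive", 2), ("active-active", 2), ("scale-out", 8)]:
    print(design, failed_state_performance(design, nodes))
```

Note how the scale-out penalty shrinks as the cluster grows: an eight-node cluster retains 87.5% of its performance after a single node failure.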

This brings up an additional advantage of scale-out storage: some workloads can be configured to survive multiple controller (node) failures. In a scale-up system -- while very unlikely -- a failure of the second controller before the first one is replaced would result in application downtime.

Data efficiency vs. pure performance

When comparing all-flash arrays, note that most provide some level of data efficiency. Many vendors, such as Pure Storage, Kaminario, Violin, Tegile and XtremIO, provide both deduplication and compression, while others, such as Hewlett-Packard and IBM, provide one or the other. Most data centers will see some benefit from one or both of these data efficiency features. Virtual environments, both desktop and server, benefit most from deduplication, whereas database environments benefit more from compression.

There is also some debate about the ability to turn these data efficiency features off, a capability that vendors like Kaminario, Violin and IBM offer. The theory is this: why spend time applying data efficiency to a data set that will see no material benefit? Turning the feature off should lead to better performance and lower storage system costs. For some data centers, this capability may make a difference. For others, it will make little difference, since the AFA, even with all these features turned on, still provides more performance than they need. In those instances, the idea of "set it up once and forget it" probably becomes more appealing.
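As a rough illustration of the cost side of this trade-off, the sketch below computes effective cost per usable gigabyte. The raw price and efficiency ratios are hypothetical, chosen only to show why efficiency matters for compressible data and not for data that is already compressed:

```python
# Back-of-the-envelope: how data efficiency changes effective $/GB.
# The $5/GB raw price and the ratios are hypothetical examples.

def cost_per_usable_gb(raw_cost_per_gb, efficiency_ratio):
    """Effective $/GB once dedupe/compression multiplies usable capacity."""
    return raw_cost_per_gb / efficiency_ratio

# A database set compressing 2:1 vs. data that gains nothing
# (ratio 1.0, e.g., pre-compressed media):
print(cost_per_usable_gb(5.0, 2.0))  # prints 2.5
print(cost_per_usable_gb(5.0, 1.0))  # prints 5.0
```

For the second data set, efficiency processing adds overhead without lowering cost, which is the case where turning the feature off makes sense.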

Conducting an all-flash array comparison can be a daunting process, but it does not have to be. The key to selecting the best, most affordable system for your environment is to understand how much performance and capacity you need today, while trying to establish a range of how much performance you will need over the next five years or so. Then understand how these systems will scale to meet those requirements.

About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.

