Navigating the all-flash array storage buying process
A collection of articles that takes you from defining technology needs to purchasing options
All-flash storage systems consisting of solid-state drives eliminate spinning disk and significantly improve performance.
While flash drives first showed up in storage systems alongside HDDs in hybrid configurations, all the major storage vendors now sell arrays with only SSDs.
All-flash array (AFA) storage remains more expensive than HDD-based storage on a dollar-per-gigabyte basis, but organizations can see total cost of ownership (TCO) advantages from reduced power and cooling and increased performance. And while flash will likely always cost more than spinning disk, its price keeps dropping as it gains in popularity.
While AFA storage was initially used to store critical business applications, today, it is being used by many companies to store all of their active workloads.
The pros and cons of implementing all-flash array storage
The main advantage of AFA storage systems over arrays with HDDs is performance. SSDs are much faster for random reads and writes, since there is no need to move read/write heads, as is necessary with HDDs. SSDs also offer higher throughput and quicker response times than HDDs.
Also, because there are no moving parts to wear out or fail, flash is typically more reliable than HDDs. Flash systems also require less power and cooling, which can be in short supply in many data centers. Flash systems have essentially the same read time for every bit stored in the system. Because there is no wait for a physical head to move to access the next bit of data, the minimum, maximum and average read times are often quite close together.
Write times can be more variable. Unlike HDDs, which can overwrite data in place one sector at a time, SSDs must erase a whole block before rewriting it, and a block can be anywhere from 512 bytes to 4 MB.
Flash outperforms HDDs by such a margin that all-flash array vendors have been able to implement deduplication for active storage. Deduplication algorithms look for cases where the same block of data is stored in multiple places. They identify the duplicates, delete them and leave a pointer to the original block. This takes a fair amount of processing power and results in many accesses to storage, both to search for duplicates and to handle changes to data that has already been written.
Originally, deduplication was used in backups and archives, finding duplicate blocks of data and only keeping one block. As all-flash arrays have increased performance, the dedupe process has moved from offline to online storage, working in combination with data compression to provide an effective capacity several times higher than the actual capacity.
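The deduplication process described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular array implements it: the fixed 4 KB block size, the use of SHA-256 fingerprints and the function names dedupe, restore and reduction_ratio are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size in bytes (hypothetical)


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep each unique block once."""
    store = {}      # fingerprint -> block contents (unique blocks only)
    pointers = []   # one fingerprint per logical block, in original order
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = block   # first copy: actually stored
        pointers.append(fingerprint)     # duplicates only add a pointer
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original data by following the pointers."""
    return b"".join(store[fp] for fp in pointers)


def reduction_ratio(store, pointers) -> float:
    """Logical blocks divided by physically stored blocks."""
    if not store:
        return 1.0
    return len(pointers) / len(store)


if __name__ == "__main__":
    data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
    store, pointers = dedupe(data)
    print(len(pointers), "logical blocks,", len(store), "stored blocks")
    print("reduction ratio:", reduction_ratio(store, pointers))
```

In this example, four logical blocks collapse to two stored blocks, a 2:1 reduction. The sketch ignores compression, variable block sizes and updates to already-written data, all of which a real system has to handle.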
However, all-flash enterprise storage has its disadvantages, too. First, it is more expensive than spinning disk. Another downside is that, until recently, SSDs held less capacity than HDDs. Both of these situations have improved: the price of flash is falling, and SSDs often become cost-effective when the cost of power and cooling is included.
Larger capacity SSDs are showing up, and will soon surpass HDDs for capacity. The first 15 TB SSDs were available in storage arrays in 2016, and much larger drives are on vendor roadmaps.
How SSD drive types have evolved
The original flash devices used cells to store data, and were named according to the number of bits each cell stored: single-level cell (SLC), multi-level cell (MLC) and triple-level cell (TLC). The cells store one, two and three bits respectively, offering greater capacities in the same space.
The next big evolution was 3D architectures, which stacked multiple layers of cells on top of each other, greatly increasing capacity and reducing cost.
One issue designers have been fighting with SSDs since the beginning is the limited lifespan of cells. Each bit on a hard drive can be written or rewritten an unlimited number of times. In contrast, the cells that make up flash can only be written to a limited number of times.
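SLCs typically have longer lifespans than MLCs or TLCs. Limited cell endurance is another big impetus behind the need to avoid write amplification, in which the drive physically writes more data to flash than the host asked it to write; in addition to affecting performance, write amplification can reduce the lifespan of drives.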
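A rough calculation shows why write amplification matters for drive lifespan. The sketch below uses the common definition of the write amplification factor (data written to flash divided by data written by the host) and a simplified endurance estimate that assumes perfect wear leveling. The drive capacity, program/erase cycle count and function names are hypothetical values chosen for illustration, not figures for any real drive.

```python
def write_amplification(host_bytes: int, flash_bytes: int) -> float:
    """Write amplification factor: flash writes divided by host writes."""
    return flash_bytes / host_bytes


def host_writes_before_wearout(capacity_bytes: int,
                               pe_cycles: int,
                               waf: float) -> float:
    """Approximate host bytes writable before cells wear out.

    Assumes wear is spread evenly across all cells (perfect wear leveling).
    """
    return capacity_bytes * pe_cycles / waf


if __name__ == "__main__":
    # Hypothetical worst case: a 4 KiB update forces a 256 KiB block rewrite.
    print("WAF:", write_amplification(4 * 1024, 256 * 1024))

    # Hypothetical 1 TB drive rated for 3,000 program/erase cycles.
    capacity = 10**12
    for waf in (1.0, 2.0, 4.0):
        total = host_writes_before_wearout(capacity, 3000, waf)
        print(f"WAF {waf}: about {total / 10**12:.0f} TB of host writes")
```

Under these assumptions, doubling the write amplification factor halves the amount of data the host can write before the cells wear out, which is why array vendors invest in reducing it.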
In addition to the types of cells inside flash drives, there are other classifications for SSDs, such as the type of protocol, interface and form factor. A protocol runs over an interface, and while they may be referred to interchangeably, they are different. For example, it is possible to run the SCSI protocol over a SCSI interface, but also over Ethernet (iSCSI).
The first flash drives used the same protocols and interfaces developed for HDDs, such as the Advanced Host Controller Interface (AHCI) for SATA drives and SCSI over SAS and Fibre Channel (FC). Because these protocols were developed with HDDs in mind, they were not as efficient with flash, so the storage industry developed a protocol specifically for flash: nonvolatile memory express (NVMe).
NVMe runs over PCI Express, which is the same interface used for add-on cards on PC motherboards. It can also run over network fabrics such as Ethernet, FC or InfiniBand.
Form factors have decreased in size over time, from the same 3.5-inch and 2.5-inch sizes used for HDDs to 1.8-inch and M.2 sizes designed for use in laptops, tablets and phones.
NVMe drives were included in a handful of storage arrays in 2016, and they will likely appear in more in 2017, before becoming commonplace in 2018.
Vendors often talk about effective capacity for all-flash arrays, which is the amount of capacity customers can expect after taking data reduction into account. Quoting effective capacity narrows the cost premium of flash over spinning disk. For example, a 1 TB HDD might be $70, while a 1 TB SSD can be $500 or more. However, the premium is less of an issue than it used to be, as most all-flash systems deliver reasonable effective capacities with performance that HDDs cannot match.
Still, you should always know the actual capacity of the arrays you purchase, and keep in mind that the results of data reduction vary according to the types of data and other factors.
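The arithmetic behind effective capacity is simple enough to check yourself. The sketch below uses the example drive prices quoted above ($70 and $500 per terabyte) and a hypothetical 4:1 data reduction ratio. It assumes the HDD gets no data reduction at all and ignores power, cooling and performance, so treat it as a way to sanity-check vendor claims rather than a full TCO model. The function names are illustrative.

```python
def effective_capacity_tb(raw_tb: float, reduction_ratio: float) -> float:
    """Capacity available to data after deduplication and compression."""
    return raw_tb * reduction_ratio


def cost_per_effective_tb(price: float, raw_tb: float,
                          reduction_ratio: float) -> float:
    """Purchase price divided by effective capacity."""
    return price / effective_capacity_tb(raw_tb, reduction_ratio)


def break_even_ratio(ssd_price_per_tb: float, hdd_price_per_tb: float) -> float:
    """Reduction ratio at which flash matches unreduced disk on price."""
    return ssd_price_per_tb / hdd_price_per_tb


if __name__ == "__main__":
    ssd_price, hdd_price = 500.0, 70.0   # example prices from the text
    ratio = 4.0                          # hypothetical 4:1 data reduction

    print("Effective capacity:", effective_capacity_tb(1.0, ratio), "TB")
    print("SSD cost per effective TB:",
          cost_per_effective_tb(ssd_price, 1.0, ratio))
    print("Break-even reduction ratio:",
          round(break_even_ratio(ssd_price, hdd_price), 2))
```

At 4:1, the $500 SSD works out to $125 per effective terabyte, still above the $70 HDD. On drive price alone, flash would need a reduction ratio of about 7:1 to match disk in this example, which is why the actual reduction your data achieves matters so much.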
How much speed can you afford?
From the first consumer-grade SSDs to the latest NVMe drives, performance has improved by a factor of 10 or more. That is true whether performance is measured in IOPS, throughput in megabytes per second or latency in milliseconds or microseconds.
The newer and faster the technology, the more expensive it is. In addition, enterprise-class storage systems have a wide variety of technologies designed to optimize or improve performance, from hundreds of gigabytes of RAM added to system controllers to custom code designed to reduce write amplification or to streamline input/output operations.
What all this means is that a 50 TB all-flash SAN system might be an SMB-oriented system that could support a few servers for $25,000, or a monster, enterprise-grade cluster of storage capable of supporting tens of thousands of users and costing $1 million or more.
Hype amplification makes an informed buying decision more difficult. A vendor's claim of 1 million IOPS may be true under certain conditions, but connect one server to one array and you might never see 100,000 IOPS. You would need a clustered architecture with multiple servers accessing the data over many different connections to achieve that performance.
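One way to see why a single server rarely reaches a headline IOPS figure is Little's law from queueing theory, which relates the number of I/Os in flight to IOPS and latency. The sketch below applies it with hypothetical queue depths, latencies and block sizes. None of the numbers come from a vendor data sheet, and the function names are illustrative.

```python
def max_iops(outstanding_ios: int, latency_seconds: float) -> float:
    """Little's law: sustained IOPS = I/Os in flight / average latency."""
    return outstanding_ios / latency_seconds


def required_outstanding_ios(target_iops: float,
                             latency_seconds: float) -> float:
    """I/Os that must be in flight to sustain a target IOPS figure."""
    return target_iops * latency_seconds


def throughput_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a given I/O size."""
    return iops * block_size_bytes / 1_000_000


if __name__ == "__main__":
    latency = 0.0005  # hypothetical 0.5 ms average latency

    # One server keeping 32 I/Os in flight (hypothetical queue depth).
    print("Single server IOPS:", round(max_iops(32, latency)))

    # Parallelism needed to sustain 1 million IOPS at the same latency.
    print("I/Os in flight needed:",
          round(required_outstanding_ios(1_000_000, latency)))

    # The same IOPS figure means different throughput at different I/O sizes.
    print("MB/s at 4 KiB:", round(throughput_mb_per_s(100_000, 4096), 1))
```

With these assumed numbers, one server sustains roughly 64,000 IOPS, while 1 million IOPS requires about 500 I/Os in flight at once. That level of parallelism usually takes many servers and connections, so ask vendors for the block size, queue depth and read/write mix behind any quoted figure.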
The AFA storage market
The leading all-flash array vendors moved into the market in different ways. Traditional storage vendors moved into flash as tier 0 or cache, then into hybrid flash/HDD systems and, finally, into all-flash. These include Dell EMC, Hewlett Packard Enterprise, Hitachi Data Systems, IBM and NetApp.
Another group of vendors started with all-flash or RAM-based systems and have little or no HDD-based storage in their product lines. These include Kaminario and Pure Storage.
Others, such as Nimble Storage and Tegile Systems, started with disk, but their controllers were optimized at the start to get the most out of flash. They call their platforms "flash-first" systems.
Several of the bigger vendors in the first group have both legacy platforms converted from disk to flash, and others that came from acquisitions of all-flash startups. For example, EMC acquired XtremIO, IBM bought Texas Memory Systems and NetApp acquired SolidFire.