Virtually all the major storage vendors are producing all-flash arrays (AFAs), and customers are buying them in record numbers. The applications for all-flash include server virtualization and virtual desktop infrastructure, big data and database applications in general, and specialized apps that require a high number of IOPS. This does not mean that spinning disk is going away -- despite what you might see in ads sponsored by some all-flash storage array vendors, the raw cost of a solid-state drive (SSD) is still much higher than that of a hard disk drive (HDD). One way to offset the higher cost is to implement compression.
SSDs are so much faster than HDDs that compression and inline deduplication can effectively increase all-flash capacity by two to three times while still providing better performance than HDDs. In addition, storage virtualization can use flash storage as either a cache or a tier zero in a tiered environment, yielding much higher overall performance for the storage as a whole, while requiring flash capacity equal to only 10% to 20% of the next tier. In practice, 10 terabytes (TB) of all-flash speeds up 50 TB to 100 TB of high-performance HDD, 100 TB of high-performance HDD speeds up 1,000 TB of high-capacity HDD, and 1,000 TB of high-capacity HDD speeds up 10 petabytes (PB) of tape or cloud. The modern data center can have as many as four or five tiers, which might include all-flash, high-performance primary HDD storage, high-capacity HDD object storage, tape and cloud. With effective tiering, data can be migrated as needed from high-performance tiers to low-cost tiers and back.
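The tier arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 10% to 20% rule of thumb; the function names and the tier labels are made up for the example and do not come from any storage product.

```python
def backed_capacity_tb(fast_tier_tb, active_percent):
    """Capacity (TB) of the next tier that a fast tier can front.

    active_percent is the share of the slower tier that must fit in the
    fast tier (the article's rule of thumb is 10 to 20).
    """
    if not 0 < active_percent <= 100:
        raise ValueError("active_percent must be in (0, 100]")
    return fast_tier_tb * 100 / active_percent


def tier_chain(top_tier_tb, active_percent, names):
    """Size each tier so the tier above it holds active_percent of it."""
    sizes = []
    size = top_tier_tb
    for name in names:
        sizes.append((name, size))
        size = backed_capacity_tb(size, active_percent)
    return sizes


if __name__ == "__main__":
    # Hypothetical tier labels; 10 TB of flash at the 10% end of the range.
    tiers = ["all-flash", "high-performance HDD", "high-capacity HDD", "tape/cloud"]
    for name, tb in tier_chain(10, 10, tiers):
        print(f"{name}: {tb:,.0f} TB")
```

At the 20% end of the range, the same 10 TB of flash fronts only 50 TB, which is where the article's "50 TB to 100 TB" figure comes from.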
There are relatively few systems that need the high performance of all-flash, especially as the capacities of all-flash go up from dozens of terabytes to several petabytes. While big data can consume lots of capacity and use all the speed of all-flash, server virtualization and virtual desktop infrastructure (VDI) typically use a relatively small amount of storage, especially when deduplication and compression are available. In many data centers, storage administrators will set up all-flash as a tier zero backed by existing tier-one to tier-three HDD-based storage.
The performance of all-flash, combined with the increased capacity from newer flash drives and lower prices, is making all-flash nearly competitive with tier-one, high-performance drives. In fact, sales of 15,000 rpm drives are declining as flash penetrates the market. However, the performance of flash is not generally needed for serving files and other basic storage tasks. All-flash can instead be used the same way 15,000 rpm drives are used: to accelerate the lower tiers of storage rather than to hold all the data on the uppermost tier. All-flash can be less expensive on a dollar-per-IOPS basis, and starting prices from all-flash storage array vendors have dropped substantially. Even so, the premium for all-flash is still substantial, especially compared to HDD systems with high-capacity disks. Therefore, it still makes sense to use flash as a tier to accelerate high-capacity storage rather than replacing all the existing HDD storage with flash.
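The difference between the two cost metrics is easy to see with a small calculation. The prices, capacities and IOPS figures below are hypothetical placeholders chosen only to show the arithmetic; they are not vendor data or measurements.

```python
def cost_per_iops(price_usd, iops):
    """Dollars paid per I/O operation per second the system can deliver."""
    return price_usd / iops


def cost_per_tb(price_usd, capacity_tb):
    """Dollars paid per terabyte of capacity."""
    return price_usd / capacity_tb


if __name__ == "__main__":
    # Hypothetical systems: substitute real quotes before drawing conclusions.
    systems = {
        "all-flash": {"price_usd": 100_000, "capacity_tb": 50, "iops": 500_000},
        "high-capacity HDD": {"price_usd": 50_000, "capacity_tb": 500, "iops": 10_000},
    }
    for name, s in systems.items():
        print(
            f"{name}: ${cost_per_iops(s['price_usd'], s['iops']):.2f} per IOPS, "
            f"${cost_per_tb(s['price_usd'], s['capacity_tb']):,.0f} per TB"
        )
```

With placeholder inputs like these, flash wins on dollars per IOPS and loses on dollars per terabyte, which is the trade-off that makes tiering attractive.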
Applications for all-flash
The primary applications that benefit from all-flash storage are those that can take advantage of the high number of IOPS. Very high throughput is also a characteristic of all-flash, but HDD-based storage can generally provide enough throughput to saturate the connections. If an HDD-based array can drive six 10 Gbps connections at full speed, going to an all-flash storage array that is also limited to six 10 Gbps connections will not provide any higher throughput. Applications that can use the IOPS include online transaction processing (OLTP), data searches of large data sets, real-time processing of data from the internet of things and virtualization of servers or VDI.
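The throughput point comes down to taking the smaller of two numbers: what the array can push and what the links can carry. A minimal sketch follows, in which the raw array throughput figures are hypothetical and only the six 10 Gbps links come from the example above.

```python
def deliverable_gbps(array_gbps, links, link_gbps):
    """Throughput a host actually sees: the lower of array and link capacity."""
    return min(array_gbps, links * link_gbps)


if __name__ == "__main__":
    # Hypothetical raw throughput for each array; six 10 Gbps links for both.
    hdd_array = deliverable_gbps(array_gbps=60, links=6, link_gbps=10)
    flash_array = deliverable_gbps(array_gbps=200, links=6, link_gbps=10)
    print(hdd_array, flash_array)  # both are capped by the same links
```

Once the links are the bottleneck, any extra throughput inside the array goes unused; the gain from flash shows up in IOPS and latency instead.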
Big data, OLTP and searches are often characterized by very large amounts of data and can use up capacity as well as performance in all-flash. In contrast, especially when coupled with deduplication and compression, VDI and server virtualization often have large amounts of data in common from one virtual machine (VM) to another, and can see greatly increased performance from relatively small amounts of capacity. Fortunately, SSD capacity has been growing by leaps and bounds over the last decade, and systems from all-flash storage array vendors have gone from capacities measured in hundreds of gigabytes to more than a petabyte while costs have plunged in the same period.
VDI and server virtualization have additional characteristics that can stress even all-flash storage. Since there are anywhere from dozens to hundreds of VMs running at once, I/O from the aggregate system tends to be very random (not predictable by algorithm) and may also be subject to I/O storms resulting from simultaneous I/O across multiple VMs. Load-balanced servers in a cluster or many client VMs updating software at the default time in VDI applications can all start using a lot of I/O simultaneously. Storage software can help with these issues by combining I/O across multiple VMs or by caching updates.
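One way storage software can combine I/O is to merge requests that touch adjacent or overlapping regions before sending them to the drives. The sketch below shows that idea only; it is an assumption about what "combining I/O" might look like, not a description of any vendor's implementation, and the `(offset, length)` request format is invented for the example.

```python
def coalesce(requests):
    """Merge overlapping or adjacent (offset, length) requests.

    Returns a sorted list of (offset, length) tuples that covers the same
    blocks with as few requests as possible.
    """
    merged = []  # list of [start, end) pairs, kept in ascending order
    for offset, length in sorted(requests):
        end = offset + length
        if merged and offset <= merged[-1][1]:
            # Touches or overlaps the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end - start) for start, end in merged]


if __name__ == "__main__":
    # Hypothetical writes arriving from several VMs at about the same time.
    pending = [(16, 8), (0, 4), (100, 4), (4, 4), (20, 8)]
    print(coalesce(pending))
```

Fewer, larger requests reduce the number of operations the array has to service during an I/O storm, although a real system would also have to respect ordering and durability guarantees that this sketch ignores.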
Storage virtualization and storage management
The key to getting the most from the expensive array that one of the all-flash storage array vendors will deliver to your data center is storage virtualization or storage management software, which may already be part of your overall data storage system. You may have virtualized your storage with third-party software that is separate from the storage hardware, or the software may be included with the all-flash storage array. Either way, once it is in place, you'll essentially get the performance of flash for all data in your data center through the magic of caching. If you get an all-flash array, you should use it as a cache in a virtualized storage environment. That's exactly how a lot of all-flash is being used in general data centers. Only big data, or very large virtual server or VDI environments, can make use of both the high IOPS and capacity of all-flash arrays. If you're not using all-flash for very specific things, it's likely going to waste, since serving files to client systems won't show a difference between 0.1 and 0.00001 second response times. Since only 10% to 20% of data is generally active at any given time, you can get the performance of the top tier for all the data in the system in a tiered storage network. This also works for multiple tiers: 100 TB of flash in tier zero accelerates 1 PB in tier one, which accelerates 10 PB in tier two, and so on.
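How close a cached system gets to flash performance depends on how often requests are served from the flash tier. A simple weighted-average model makes that visible. The model and the hit ratios are assumptions added for illustration; the two response times are borrowed from the figures mentioned above.

```python
def average_response_s(hit_ratio, fast_s, slow_s):
    """Average response time when hit_ratio of requests hit the fast tier."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * fast_s + (1.0 - hit_ratio) * slow_s


if __name__ == "__main__":
    fast_s, slow_s = 0.00001, 0.1  # response times taken from the text above
    for hit_ratio in (0.5, 0.9, 0.99, 0.999):  # hypothetical hit ratios
        avg = average_response_s(hit_ratio, fast_s, slow_s)
        print(f"hit ratio {hit_ratio}: {avg:.6f} s")
```

Under this model the misses dominate the average, so the benefit depends on the tiering software keeping the active 10% to 20% of data in flash.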
The original storage tiers from around 2000 were all HDD-based, with 15,000 rpm, 10,000 rpm and 7,200 rpm drives fitting into tiers one, two and three. When flash came along and showed its performance potential, it was designated tier zero. As needs changed, additional tiers for disaster recovery, backups and archiving, as well as object storage, were added at the back end. Once the tiers are all integrated into a storage management system, even data archived in low-cost storage can generally be retrieved and brought online in relatively short order.
Uses for AFAs as primary storage are relatively limited. There aren't that many applications that will show increased performance running on all-flash as opposed to a good hybrid system, or even high-capacity NAS. Big data, OLTP, databases, server virtualization and VDI can use the speed and throughput of all-flash, but many other apps will show little gain from the increased performance. Adding flash as a tier zero is a practical use for the performance of all-flash that can provide most of the benefit of all-flash while allowing administrators to avoid forklift upgrades and to continue to use their existing HDD storage.
As cloud storage continues to drop in price and the ease of moving both server workloads and data to the cloud continues to improve, the data center itself may become moot. By the time all-flash has dropped enough in price to make it worthwhile to replace existing high-capacity tiers of HDD storage, many administrators may have chosen to migrate to the cloud instead.
AFA technology end game
AFAs from all-flash storage array vendors won't replace HDD storage in most applications for years, but they are currently the fastest affordable storage, with capacities and performance continuing to improve rapidly. With good tiering software and storage management systems, data center storage systems can use all-flash as a front end for a system with a much higher capacity and still reap the benefits of high performance.
Eventually, when costs have dropped enough to make HDDs obsolete, the same storage management systems will make it easy for storage administrators to migrate data off their old systems and onto SSDs.