Solid-state storage vendors are touting "million IOPS" performance numbers. Here's what you need to know to see if those SSD performance benchmarks ring true.
It seems as if each week brings a new performance claim of 1 million IOPS from a storage vendor touting its solid-state storage system. Because some of these published solid-state drive (SSD) performance benchmarks lack full disclosure, it can be tough for a user to understand what these million IOPS performance numbers mean and how they relate to current storage performance issues.
Data storage vendors go to great lengths to demonstrate their products' high performance and to prove to storage managers that their products can handle large volumes of data center activity. Solid-state vendors started publishing SSD performance benchmarks to demonstrate how a 1U or 2U solid-state storage device could outperform a large enterprise-class storage system tricked out with thousands of drives. The vendors also wanted to demonstrate that they could not only achieve 1 million IOPS in such a small, efficient footprint, but they could do it at a fraction of the cost of a high-end storage array.
But the playing field is far from level; for example, high-end storage systems boast data protection and data management facilities that a majority of solid-state storage offerings can't match. Many storage vendors, including those selling high-end enterprise storage, are racing to close that gap to give users the best of all worlds: performance, price, data protection and management. Recently, they've reached 1 million IOPS with enterprise solutions that include features like thin provisioning, snapshots, replication, infrastructure management and monitoring, and API support for server virtualization interfaces like VMware's vStorage APIs for Array Integration (VAAI).
Vendors that offer controller-based storage have been redesigning their storage controllers to handle the increased performance capacity offered by today's SSDs. The addition of dynamic tiering has allowed for highly active data to be automatically serviced by the solid-state storage layer. When configured and tuned properly, this can greatly increase the performance of a workload. Less-frequently accessed data is still stored on rotating media to minimize cost.
Benchmarks versus load generators
To compare results fairly, it's important to distinguish benchmarks from load generators. Many published results have been mislabeled as benchmark results, which causes confusion when you try to compare them. Load generators are often mistaken for benchmarks because a load generator may be used to create a benchmark.
A benchmark is a fixed workload that has reporting rules and a fixed measurement methodology around it so the characteristics can't be changed. Standard industry benchmarks impose further restrictions, often with an independent reviewer who ensures the compliance of the results. This ensures users get an "apples-to-apples" comparison between similar products. There are currently two standards bodies (see the chart "Current industry-standard storage benchmarks") that offer industry-standard benchmarks for storage: the Storage Performance Council (SPC) and the Standard Performance Evaluation Corporation (SPEC).
A load generator is a tool used to simulate a desired load for performance characterization or to help reveal performance issues in a system or product. These generators have "knobs" to adjust the desired workload characteristics and are used not only by performance professionals but by testing organizations to validate a product's established specifications. Results often can't be compared with those from other vendors because there's no guarantee that the test conditions were equal while measuring the system under test.
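To make the "knobs" point concrete, here is a minimal sketch of how two load-generator runs might be checked for comparable test conditions. The `WorkloadSettings` fields and the two sample runs are hypothetical and chosen only for illustration; real tools expose their own, much larger sets of parameters.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class WorkloadSettings:
    """Hypothetical load-generator knobs (field names are illustrative only)."""
    block_size_bytes: int
    read_percent: int
    random_percent: int
    outstanding_ios: int
    preconditioned: bool


def differing_knobs(a: WorkloadSettings, b: WorkloadSettings) -> list[str]:
    """Return the names of the settings that differ between two test runs."""
    da, db = asdict(a), asdict(b)
    return sorted(name for name in da if da[name] != db[name])


# Two made-up runs, not taken from any published result.
vendor_a = WorkloadSettings(512, 100, 100, 256, False)
vendor_b = WorkloadSettings(4096, 70, 100, 32, True)

# Any non-empty list means the two results were not measured under equal conditions.
print(differing_knobs(vendor_a, vendor_b))
# ['block_size_bytes', 'outstanding_ios', 'preconditioned', 'read_percent']
```

If even one setting differs, the two IOPS figures describe different workloads and shouldn't be compared head to head.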
It's important to be aware of these differences since each of the 1 million IOPS results was likely measured under different conditions by the various vendors. Some results have artificially high caching effects that are very hard to reproduce in a normal operational environment and they often lack full disclosure to permit reproducing the test in one's own data center. The chart titled "Popular I/O-generator tools" shows two of the most popular load generators currently used in performance testing labs. There are many other good generators in use that were created by major storage vendors, but these two are noted here not only for their popularity but because they're available for free at http://sourceforge.net.
Popular I/O-generator tools
Comprehensive workload generator that can also replay captured workload traces
Generates uniform I/O loads for speeds-and-feeds information
Most million IOPS results are based on testing with 512-byte blocks, but most enterprise online transaction processing (OLTP) applications use data transfer sizes of 4 KB or 8 KB. Measurements by various vendors have shown that well-behaved solid-state storage delivers approximately 20% fewer IOPS with 4 KB transfers than with 512-byte blocks (approximately 800,000 IOPS instead of 1 million), and some solid-state products that aren't as well behaved may show as much as a 40% drop. In addition, most million IOPS results are more speeds-and-feeds information than measured application or workload performance numbers.
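The arithmetic behind those percentages is simple to work through. The sketch below assumes a 1 million IOPS baseline at 512 bytes, treats 4 KB as 4,096 bytes and a megabyte as 1,000,000 bytes, and uses hypothetical function names; it shows that a lower IOPS figure at a larger block size can still mean far more data moved per second.

```python
def iops_after_drop(baseline_iops: int, drop_percent: int) -> int:
    """IOPS remaining after a given percentage drop from the baseline."""
    return baseline_iops * (100 - drop_percent) // 100


def throughput_mb_per_s(iops: int, block_size_bytes: int) -> float:
    """Data rate in MB/s (1 MB = 1,000,000 bytes) for a given IOPS and block size."""
    return iops * block_size_bytes / 1_000_000


baseline = 1_000_000                               # measured with 512-byte blocks
well_behaved_4k = iops_after_drop(baseline, 20)    # 800000
poorly_behaved_4k = iops_after_drop(baseline, 40)  # 600000

print(throughput_mb_per_s(baseline, 512))          # 512.0 MB/s
print(throughput_mb_per_s(well_behaved_4k, 4096))  # 3276.8 MB/s
```

This is one reason an IOPS figure means little unless the block size is reported with it.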
Measuring solid-state storage
Solid-state storage has unique behavior characteristics and, unlike hard disk drives (HDDs), has no moving parts to consider. HDD metrics like rotational latency and seek times don't apply. Because those latencies are eliminated, response times are usually measured in microseconds for solid-state instead of in milliseconds as with HDDs. It's important for end users (consumers of these results) to understand how these measurements are performed to ensure that the reported results represent a verifiable and sustainable level of performance.
Not all solid-state drives perform equally. Single-level cell (SLC) SSDs have faster access times than multi-level cell (MLC) SSDs. DRAM-based solid-state storage is currently considered the fastest, with average response times of 10 microseconds compared with an average of 100 microseconds for other SSDs. Enterprise flash devices (EFDs) are designed to handle the demands of Tier-1 applications with performance and response times similar to less-expensive SSDs. EFDs have enterprise-level data protection and management features that sometimes come at a small cost in performance, depending on the vendor. Each manufacturer has created its own wear-leveling algorithms, and some algorithms can create large drops in performance over time for write-intensive workloads. Another factor to consider is the storage protocol used for accessing the SSDs. Fibre Channel is still the highest performing protocol, but SAS isn't far behind. iSCSI and SATA perform well with SSDs, but most products built around those technologies won't produce 1 million IOPS results unless they have other caching features to assist performance.
The location of solid-state storage in the I/O path can also be a factor in producing a million IOPS result. Microsecond response times are easier to achieve if the solid-state storage is located closer to the host. Many vendors have taken advantage of this fact with PCI Express (PCIe) flash cards and SSDs that plug into a host like internal HDDs. Intelligent software from companies like Fusion-io, LSI, Proximal Data, SanDisk and VeloBit can help get the maximum performance out of host-side SSDs and PCIe flash cards.
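Response time and IOPS are tied together by the number of I/Os kept in flight. As a rough sketch using Little's Law (throughput equals concurrency divided by response time), the functions below estimate how many outstanding I/Os it takes to reach a target rate at the response times quoted above. The function names are hypothetical, and the model assumes the average response time stays constant as load increases, which real devices don't guarantee.

```python
import math


def iops_from_latency(outstanding_ios: int, avg_response_time_us: float) -> float:
    """Little's Law: IOPS = outstanding I/Os / average response time (in seconds)."""
    return outstanding_ios * 1_000_000 / avg_response_time_us


def outstanding_ios_needed(target_iops: int, avg_response_time_us: float) -> int:
    """Smallest number of in-flight I/Os that sustains target_iops at this latency."""
    return math.ceil(target_iops * avg_response_time_us / 1_000_000)


print(iops_from_latency(1, 100))               # 10000.0 IOPS with one I/O in flight
print(outstanding_ios_needed(1_000_000, 100))  # 100 outstanding I/Os at 100 microseconds
print(outstanding_ios_needed(1_000_000, 10))   # 10 outstanding I/Os at 10 microseconds
```

In other words, a million IOPS result always implies some level of concurrency, so the IOPS figure is hard to interpret without it.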
Even hypervisor vendors have gotten into the million IOPS reporting act by demonstrating how a single virtual machine can drive 1 million IOPS just like a physical server. VMware used a popular I/O load generator to rack up million IOPS results using a Violin Memory 6000 all-flash storage array. Microsoft has also published a 1 million IOPS result for Windows Server 2012. Because the two results were measured under very different conditions, they can't be competitively compared.
Solid-state storage performance measurement
Measuring solid-state storage performance has different requirements than measuring HDD performance, so it's key to confirm that published results properly followed solid-state measurement procedures. There are four main steps that have to be performed to demonstrate sustained solid-state performance:
- Create a common starting point. Solid-state storage needs to be in a known, repeatable state. The most common starting point is either a new SSD that has never been used or an SSD that has been low-level formatted to wipe its contents and restore it to its original state.
- Conditioning. Solid-state storage has to be put in a "used" state. During initial measurements, solid-state will show artificially high performance that's only temporary and not sustainable. These numbers shouldn't be reported as a demonstration of the solid-state's true sustained performance. For example, running random 4 KB writes against the storage for approximately 90 minutes should put it into a "used" state. Depending on the manufacturer, the transfer size or amount of time needed for conditioning may change.
- Steady state. Performance levels will settle down to a sustainable rate; that's the performance level that should be reported.
- Reporting. The level of reporting is important. If a standard benchmark requiring full disclosure wasn't used, there's a minimum amount of information required. The type of I/O is important to know. Most results are reported as 100% random reads because random writes diminish performance. Most random-write workloads on solid-state storage perform no better than they do on HDD systems. Some reports also disclose the number of outstanding I/Os, which is a "nice to have" piece of information if coupled with a reported average response time.
Even after following these important steps to measure solid-state storage performance, it's still hard to compare results without fair-use rules that set common comparison criteria. More information about these four steps can be found through the Storage Networking Industry Association's (SNIA) Solid State Storage Initiative (SSSI).
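The conditioning and steady-state steps can be sketched as a simple check over per-round IOPS measurements. The code below is an illustration only: the five-round window, the 20% spread threshold, the function name and the sample numbers are all assumptions for this example, not values taken from this article or a definitive statement of any specification's rules.

```python
def find_steady_state(iops_per_round, window=5, max_spread=0.20):
    """Find the first run of `window` rounds whose spread (max - min) is within
    `max_spread` of the window's mean. Returns (start_index, mean_iops) or None."""
    for end in range(window, len(iops_per_round) + 1):
        recent = iops_per_round[end - window:end]
        mean = sum(recent) / window
        if mean > 0 and (max(recent) - min(recent)) <= max_spread * mean:
            return end - window, mean
    return None


# Made-up measurements: a fresh device starts fast, then settles as it is conditioned.
rounds = [950_000, 900_000, 700_000, 520_000, 430_000,
          410_000, 405_000, 400_000, 398_000, 402_000]

print(find_steady_state(rounds))  # (4, 408600.0)
```

In this made-up run, the early rounds near 950,000 IOPS are the temporary fresh-device numbers; the figure to report is the roughly 408,600 IOPS the device sustains once its performance has settled.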
Standards organizations back their SSD benchmarks
Industry-standard benchmarks and other well-accepted benchmarks are the best means of getting competitive comparisons, with full disclosure, of vendors' product offerings. These benchmarks are usually based on application workloads and have strict rules around measurement, independent certification and/or audit, and reporting. These rules are in place for the end user's benefit: they provide assurance that an independent third party has verified the information and that the reporting rules were followed. In addition, they ensure that full disclosure is reported with the same information and in the same format for easy consumption and comparison with the results of tests conducted on other products.
Standards organizations like SPC, SPEC and SNIA SSSI are good examples of industry organizations putting standards in place to ensure the proper measurement of solid-state storage performance. The SPC workloads, for example, are based on Tier-1 applications and can't be compared to 100% random-read results.
Solid-state technology is still maturing, including finding the best ways to sustain long-term high performance from solid-state-based products. By understanding how this high-performance technology is measured, you'll have a better sense of where it might boost the performance of mission-critical applications, and virtualized and cloud infrastructures in the data center.
About the author:
Leah Schoeb is a senior partner at Boulder, Colo.-based Evaluator Group.