SSD caching vs. primary storage for data placement

SSD caching and primary storage are the two main ways to implement solid-state storage technology with respect to data placement. We explain both strategies.

What you will learn: Data placement for solid-state storage technology can be achieved in two ways: via SSD caching or primary storage. Learn about the pros and cons of both methods of implementation and how to determine which is right for your environment.

Solid-state storage technology, or solid-state drives (SSDs), is top of mind for many data storage professionals because it gives storage systems the opportunity for tremendous performance improvements. With any SSD project, there are two decisions to be made at the outset: the hardware form factor to use and how the data is placed on the solid-state storage.

This technical tip focuses on placing the data. There are two basic ways to implement solid-state storage with respect to data placement: caching and primary storage.

SSD caching for data placement

Caching is a technique where the controller -- whether it's software, a RAID controller inside a server or a high-end external disk controller -- uses solid-state storage technology as a cache in front of traditional disk storage. The caching controller identifies frequently accessed data, sometimes called "hot data," and automatically moves it to the solid-state media. While different caching controllers may have slightly different caching algorithms, the basic idea is to improve performance by getting the hot data onto the fastest media so I/O performance increases and I/O latency decreases. As I/O patterns change over the course of minutes, hours and days, the caching controller automatically observes which data is most frequently accessed and moves it onto the fastest media, without any intervention on the part of the user or administrator.

Multiple applications can benefit from the SSD cache simultaneously as the I/O traffic increases because the caching controller simply looks for heavy I/O traffic wherever it occurs and accelerates it, subject to its algorithms and the amount of cache available. With caching systems, performance improves over time as the cache is populated with data. This is sometimes known as “ramp up.” Another advantage of the caching system is that the overall load on the spinning hard disk drives is reduced because frequently accessed data is accessed from the solid-state devices. Some caching systems only cache reads, while others cache reads and writes. Caching technology can be applied to block storage devices and to network-attached storage (NAS) devices.
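The behavior described above can be illustrated with a small simulation. This is only a minimal sketch under simplifying assumptions, not any vendor's algorithm: it caches reads only, it ranks blocks purely by access count, and the class name HotDataCache and its methods are hypothetical names invented for this example.

```python
from collections import Counter


class HotDataCache:
    """Toy read cache that keeps the most frequently read blocks on SSD.

    Assumption: "hot" simply means "highest access count so far".
    Real caching controllers use their own, more elaborate algorithms.
    """

    def __init__(self, ssd_capacity_blocks):
        self.capacity = ssd_capacity_blocks
        self.access_counts = Counter()  # reads observed per block
        self.on_ssd = set()             # blocks currently held on SSD
        self.hits = 0
        self.misses = 0

    def read(self, block):
        # A hit is served from SSD; a miss goes to the spinning disk.
        if block in self.on_ssd:
            self.hits += 1
        else:
            self.misses += 1
        self.access_counts[block] += 1
        self._rebalance()

    def _rebalance(self):
        # Keep only the `capacity` most frequently read blocks on SSD.
        hottest = self.access_counts.most_common(self.capacity)
        self.on_ssd = {block for block, _ in hottest}

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = HotDataCache(ssd_capacity_blocks=2)
for block in [1, 2] * 5:  # two hot blocks read repeatedly
    cache.read(block)
print(cache.misses, cache.hits, cache.hit_ratio())
```

In this run, the first read of each block misses because the cache starts empty, and every later read is a hit. That early stretch of misses is the ramp-up period, and each hit is a read the spinning disks no longer have to serve.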

Primary storage for SSD data placement

With primary storage implementations, the user decides which data to place on the SSD and when to place it there. The user must take a specific action to move the data to the SSD, and the application that uses this data must then be told about its correct location. There are two significant differences between SSD primary storage implementations and SSD caching techniques: In primary storage implementations, only the applications whose data is on the SSDs receive the benefits of the improved performance. And, unlike caching systems where performance ramps up over time, primary storage implementations see performance improvements immediately.

Still, one chief drawback of using solid-state storage in a primary storage implementation is that a good decision for data placement today might not be a good decision tomorrow. For example, if an important application that only runs at month's end needs improved performance, its data must be placed onto the SSDs just before the month-end processing begins and then moved off the SSDs after the month-end processing finishes. To resolve this problem, many of the vendors who offer solid-state storage technology provide automation software for primary storage implementations that can help select and move data to the SSDs automatically. These solutions may work at the full LUN level or may be able to operate at a sub-LUN level. In addition, these solutions generally provide policy-based data movement functions where the user can set thresholds for when to promote data to the SSDs and when to demote it back to the spinning disk. These automated tiering software solutions are at various levels of maturity.
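The month-end example can be expressed as a simple time-based policy. The following is a hedged sketch only: TieringPolicy, plan_moves, the volume name "ledger" and the "ssd"/"hdd" tier labels are all hypothetical, and real tiering products expose their own policy models, often at sub-LUN granularity.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass
class TieringPolicy:
    """Hypothetical policy: hold a volume on SSD near the end of each month."""
    volume: str
    days_before_month_end: int  # promote this many days before month end

    def target_tier(self, today: date) -> str:
        # monthrange returns (weekday of the 1st, number of days in month).
        last_day = calendar.monthrange(today.year, today.month)[1]
        days_left = last_day - today.day
        return "ssd" if days_left <= self.days_before_month_end else "hdd"


def plan_moves(policies, current_tiers, today):
    """Return (volume, from_tier, to_tier) for each volume that should move."""
    moves = []
    for policy in policies:
        target = policy.target_tier(today)
        current = current_tiers.get(policy.volume, "hdd")
        if current != target:
            moves.append((policy.volume, current, target))
    return moves


policies = [TieringPolicy(volume="ledger", days_before_month_end=2)]

# Two days of lead time: the volume is promoted on Jan. 30 ...
print(plan_moves(policies, {"ledger": "hdd"}, date(2024, 1, 30)))
# ... and demoted once the new month starts.
print(plan_moves(policies, {"ledger": "ssd"}, date(2024, 2, 1)))
```

The sketch only plans the moves; carrying them out, and deciding whether a move operates on a whole LUN or part of one, is the job of the vendor's tiering software.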

How to choose between SSD caching and primary storage

Some vendors have chosen the SSD caching approach, while others have taken the primary storage approach. Because each approach has advantages, many of the vendors who started with caching have added primary storage as an option to their solutions, and many of the vendors who started with primary storage are now adding caching as an option.

Ultimately, it's up to users to decide which approach is best for their specific environment. We've run performance tests in our lab looking at both types of solid-state storage implementations and published results for some of these tests. Look at our test results on the Demartek SSD zone.

BIO: Dennis Martin is president at Demartek LLC, an Arvada, Colo.-based industry analyst firm that operates an on-site test lab.
