The benefits of solid-state storage are clear: it’s fast, runs cool and sips power. But the technology is also changing the fundamental ways we use data center storage.
Solid-state storage has brought a slew of changes to data storage environments and reshuffled the way we approach ongoing storage operations. Solid-state has revived interest in automated tiering, caching applications and data compression, along with providing high-performance persistent storage.
Solid-state storage is not only transforming the storage industry, it’s making waves across the entire computing industry. We’ve seen how flash storage has completely revolutionized the consumer electronics space, replacing spinning disk drives in virtually every category of consumer devices.
This same enthusiasm for flash storage is spreading to the data center. Database administrators, system admins and application owners have become aware of solid-state storage and the benefits it brings. They recognize the performance and power consumption benefits, but still have some concerns about whether flash endurance is sufficient for enterprise applications. The storage industry is addressing those concerns with newer flash controllers that extend the life and performance of lower-cost flash media, so that it can be used in enterprise applications in place of more expensive enterprise flash media.
Best fit for SSD
Solid-state drives (SSDs) provide a viable, faster alternative to hard disk drives (HDDs). But the first step in determining the correct storage for a job is defining the application’s specific storage performance requirements. Those requirements should determine whether solid-state storage or traditional hard disk storage is the most appropriate and most cost-effective solution.
Key SSD management terms
- Automated tiering software moves hot data to the solid-state drive (SSD) tier using policy set by the administrator to improve performance.
- Caching copies hot data to the SSD to improve performance.
- Compression shrinks data so that it consumes less storage capacity than in its “native” state.
- Latency is the time it takes for a storage device to respond to a request. Lower latency is better.
We’ll start with a relatively simple example and then move up to more complex circumstances. One area where SSDs are already creeping into the data center environment is laptop computers. An SSD in a laptop PC provides very fast boot-ups and overall performance, and also extends battery life significantly because the internal SSD uses very little electric power. Word processing, large documents with graphics, spreadsheet macros, databases and video all respond very quickly. Copying files to or from the laptop is also very fast. So the effects of using solid-state storage go beyond just application performance.
The same benefits can also be realized with desktop PCs. Using solid-state storage to boot computers, for example, is a relatively easy and inexpensive way to improve performance. Installing an SSD as the boot drive in a desktop computer can extend the life of an older machine simply because much of the I/O is accelerated. This can also work for older laptop computers, if you get an SSD with the correct interface.
Speeding up database operations
Any application that needs better performance or lower storage latency is a good candidate for solid-state storage technology. For example, many database operations, such as table scans and queries, are actually a sequence of several small requests grouped together as a package. The requests execute sequentially, with the output of one request becoming the input of the next. The final answer isn’t returned to the application until all the smaller requests that make up the transaction have been satisfied. In these cases, the significantly reduced latency (faster turnaround) that solid-state storage delivers can make a huge difference in the overall performance of the application or the end-user experience.
The best enterprise hard disk drives have an average seek time latency of approximately 2ms for every request, and not every storage system enables the cache memory on the drives due to data protection concerns. So even if the SSDs in use had the exact same performance as the hard drives, the SSDs would provide better overall latency because they have no seek time. Imagine running a large batch of complex database transactions where every I/O is subject to the seek time latency of good enterprise hard drives; then imagine that same batch of complex database operations without the seek time latency and with faster storage devices, and you’ll see why SSDs are so good for database applications.
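To put rough numbers on that thought experiment, here is a minimal sketch in Python. The 2 ms seek figure comes from the text above; the request count and the 0.5 ms per-request service time are assumed values chosen purely for illustration, not measurements. The function name `batch_latency_ms` is hypothetical.

```python
def batch_latency_ms(num_requests: int, seek_ms: float, service_ms: float) -> float:
    """Total time for requests that must run strictly one after another."""
    return num_requests * (seek_ms + service_ms)


REQUESTS = 10_000   # assumed number of dependent I/Os in the batch
SERVICE_MS = 0.5    # assumed per-request service time, same for both devices

# HDD pays roughly 2 ms of seek time per request; the SSD pays none.
hdd_total = batch_latency_ms(REQUESTS, seek_ms=2.0, service_ms=SERVICE_MS)
ssd_total = batch_latency_ms(REQUESTS, seek_ms=0.0, service_ms=SERVICE_MS)

print(f"HDD: {hdd_total / 1000:.1f} s, SSD: {ssd_total / 1000:.1f} s")
# prints "HDD: 25.0 s, SSD: 5.0 s"
```

Even with identical service times, removing the seek component cuts this assumed batch from 25 seconds to 5. Because each request waits on the one before it, the per-request saving is multiplied by the length of the chain.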
SSD enables tiering
Performance gains with solid-state storage technology aren’t limited to database applications. This is why we see an increased demand for caching and tiering solutions. Most servers -- whether singly or in groups -- are kept busy with a variety of application workloads, each having various busy times and slow times. If you have all your data on solid-state storage, you might not need to consider caching and tiering; but if your data center is like the majority of data centers, most of your current application data is kept on some type of spinning hard disk drives.
With tiering for SSDs, the user decides what data to place on the SSD and when to place it there. Tiering can be performed manually or with automated tiering software on the host or in the storage controller. Tiering is all about moving specific hot data to the SSD tier at the right time and moving it back to the slower disk tiers (again, at the right time). If tiering is performed manually, then the administrator must observe the I/O activity over time and decide when to move certain files or data. You would have to manually track the number of accesses of every file on all your systems and then decide when to move files to and from SSDs based on those accesses. For systems of any size, this would be an impossible task to do manually, so automated tiering software would be required. With automated tiering software, the file and data accesses are tracked automatically and data movement occurs at a scheduled time based on user-defined policies. Tiering only benefits the apps whose data is moved to the faster tier, but the performance boost is immediate and significant. Automated tiering solutions are a good choice if you have several applications you believe need the performance boost but you can’t or don’t want to decide -- or you don’t have the time to prove -- which apps need the performance boost. If you only had one application that could benefit from tiering, you wouldn’t need automated tiering software. But most data centers have dozens, hundreds or maybe even thousands of applications that could benefit from higher performance.
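The core loop of automated tiering (track accesses, then move data at a scheduled time according to policy) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product works: the class name `AutoTierer`, the "slots" measure of SSD capacity and the simple access-count threshold are all hypothetical.

```python
from collections import Counter


class AutoTierer:
    """Toy model of policy-driven tiering between an SSD tier and an HDD tier."""

    def __init__(self, ssd_slots: int, hot_threshold: int):
        self.ssd_slots = ssd_slots          # how many items fit on the SSD tier
        self.hot_threshold = hot_threshold  # accesses per period needed to count as hot
        self.accesses = Counter()           # access counts for the current period
        self.ssd_tier = set()               # items currently placed on SSD

    def record_access(self, item: str) -> None:
        """Called on every I/O; this is the tracking an admin could not do by hand."""
        self.accesses[item] += 1

    def rebalance(self) -> None:
        """Run at the scheduled time: promote the hottest items, demote the rest."""
        hot = [item for item, count in self.accesses.most_common()
               if count >= self.hot_threshold]
        self.ssd_tier = set(hot[:self.ssd_slots])  # anything not listed returns to HDD
        self.accesses.clear()                      # start a fresh observation period


tierer = AutoTierer(ssd_slots=2, hot_threshold=3)
workload = ["orders.db"] * 5 + ["index.db"] * 4 + ["logs.txt"] * 3 + ["archive.tar"]
for item in workload:
    tierer.record_access(item)
tierer.rebalance()
print(sorted(tierer.ssd_tier))  # prints ['index.db', 'orders.db']
```

Note that `logs.txt` meets the threshold but still stays on disk because the SSD tier is full. That is the trade-off a real policy has to manage: only the data that wins a place on the fast tier gets the boost.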
Another approach is SSD caching. Like automated tiering, caching is handled by host software or the storage controller, but it places a copy of the hot data in the SSD cache and leaves the original in the location known to users and applications. Caching is relatively simple to manage because nearly all the decisions are made by the caching software or controller. Caching benefits any application whose data is considered “hot” within the scope of data accessible to the cache, but the performance improvement is a bit more gradual, increasing as more data is placed into the cache. This gradual performance improvement is called “warm-up” or “ramp-up,” and it can occur over minutes or hours, depending on the implementation and the number of I/O operations occurring. Caching can be read-only or for both reads and writes, depending on the implementation. Caching with SSDs follows many of the same caching algorithms used for memory caching or even caches inside of processors. Some SSD caching solutions will not only cache the obvious hot data, but may pre-fetch adjacent data the caching software believes might become hot based on the I/O patterns observed. Most caching solutions let the admin decide which files or volumes are eligible for the cache performance boost, so you can keep certain data from clogging the cache. If you believe that most or all of your applications would benefit from a performance boost, you should consider SSD caching.
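The copy-not-move behavior and the warm-up effect are easy to see in a small model. The sketch below assumes a read-only cache with least-recently-used (LRU) eviction, which is one common caching algorithm and not necessarily what a given SSD caching product uses. The class name `SsdReadCache` and the dictionary standing in for the disk volume are hypothetical.

```python
from collections import OrderedDict


class SsdReadCache:
    """Toy read-only cache with LRU eviction in front of a slower backing store."""

    def __init__(self, capacity: int, backing_store: dict):
        self.capacity = capacity            # number of blocks the SSD cache can hold
        self.backing_store = backing_store  # stands in for the HDD volume
        self.cache = OrderedDict()          # block_id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id: int) -> str:
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return self.cache[block_id]
        self.misses += 1
        data = self.backing_store[block_id]   # the original stays where it is
        self.cache[block_id] = data           # the cache keeps only a copy
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the least recently used block
        return data


disk = {block_id: f"data-{block_id}" for block_id in range(100)}
cache = SsdReadCache(capacity=4, backing_store=disk)
hot_blocks = [1, 2, 3, 4]

hit_rates = []
for _ in range(3):                            # three passes over the same hot set
    hits_before = cache.hits
    for block_id in hot_blocks:
        cache.read(block_id)
    hit_rates.append((cache.hits - hits_before) / len(hot_blocks))

print(hit_rates)  # prints [0.0, 1.0, 1.0]
```

The first pass misses on every block and the later passes hit on every block, which is warm-up in miniature. With a real workload the hot set is larger and less tidy, so the hit rate climbs gradually rather than in one step.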
With tiering and caching, SSDs can be added to the configuration to allow more capacity for the performance boost. For either of these solutions, you’ll have to figure out how much SSD capacity is needed to make a difference. Many environments seem to need as little as 3% or as much as 10% of the total disk storage capacity in SSD technology to get a significant performance boost.
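As a starting point for that sizing exercise, the 3% to 10% rule of thumb can be turned into a quick calculation. This is only a sketch: the function name `ssd_capacity_range_tb` and the 200 TB example are assumptions, and the right figure for a given environment depends on how much of its data is actually hot.

```python
def ssd_capacity_range_tb(total_disk_tb: float,
                          low_pct: int = 3,
                          high_pct: int = 10) -> tuple[float, float]:
    """Return the (low, high) SSD capacity suggested by the 3%-10% rule of thumb."""
    return (total_disk_tb * low_pct / 100, total_disk_tb * high_pct / 100)


low_tb, high_tb = ssd_capacity_range_tb(200)  # assumed 200 TB of disk capacity
print(f"SSD capacity to evaluate: {low_tb:.0f} TB to {high_tb:.0f} TB")
# prints "SSD capacity to evaluate: 6 TB to 20 TB"
```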
Data compression with SSD
Compression is another topic getting a fresh look because of solid-state storage technology. Because SSD technology is generally more expensive than hard drive technology when looking only at price per gigabyte, one way to increase its benefit is to compress the data before placing it on the SSDs, thereby consuming less of a more precious resource. These days, with processors gaining in performance, it may make more sense to spend some extra CPU cycles compressing data so that it can be placed on significantly faster storage devices, increasing overall performance. This can make sense whether the compression occurs in the host or in the storage system. When compression is enabled on some storage systems, the data is compressed immediately upon entering the storage system. Any cache, SSD or disk device then sees only the compressed data, and less capacity is consumed at every stage.
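The trade of CPU cycles for capacity can be illustrated with Python’s standard `zlib` module. This is a host-side sketch only: the helper names `compress_block` and `decompress_block` are hypothetical, storage systems use their own compression engines, and the highly repetitive sample data compresses far better than typical production data would.

```python
import zlib


def compress_block(data: bytes, level: int = 6) -> bytes:
    """Spend CPU cycles up front so less SSD capacity is consumed."""
    return zlib.compress(data, level)


def decompress_block(stored: bytes) -> bytes:
    """Recover the original bytes when the data is read back."""
    return zlib.decompress(stored)


# Assumed sample data: a repetitive record, chosen so the effect is easy to see.
original = b"customer_id,order_id,status\n" * 2000
stored = compress_block(original)
ratio = len(original) / len(stored)

assert decompress_block(stored) == original  # compression here is lossless
print(f"{len(original)} bytes -> {len(stored)} bytes (about {ratio:.0f}:1)")
```

The `level` parameter is where the CPU-versus-capacity trade-off shows up: higher levels generally produce smaller output at the cost of more processing time.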
Another solution, mentioned above, is to have all-flash storage arrays. This will become more common as the big storage vendors take up the trend begun by some startup companies last year. We can fully expect to see flash arrays from the big storage vendors announced this year. It’s also very likely that in the near future, all-flash arrays will have the same advanced features that current hard drive-only systems have today, including things like thin provisioning, data deduplication and more.
BIO: Dennis Martin has been working in the IT industry since 1980, and is the founder and president of Demartek, a computer industry analyst organization and testing lab.