Getting the most from your storage system is a basic requirement, but with many efficiency techniques you’ll have to decide if performance tops capacity utilization.
A challenging economy may have been the original impetus for improving the efficiency of installed storage systems, but now it’s just a way of life for data storage managers. Dismal disk capacity utilization of well under 50% and underutilized storage controllers with only a fraction of their processing resources tapped are increasingly unacceptable in most IT shops. Inefficient storage operations have become a sore spot in data centers where server consolidation and virtualization have been able to maximize the use of physical server resources.
Server virtualization has not only set an efficiency precedent that storage systems must follow, it’s also forcing storage systems to become more efficient. When a storage array can’t allocate capacity and storage resources granularly, its utilization declines as the number of servers connected to it increases. To maximize the impact of data center consolidation, server virtualization and efficient storage must go hand-in-hand.
Storage efficiency is a combination of maximizing the use of both available storage capacity and processing resources, which often turn out to be competing efforts. “In order to achieve a required number of IOPS, a storage system may be designed with a very large number of spindles with dismal capacity utilization and, vice versa, very high capacity utilization achieved by leveraging thin technologies, [such as] compression and deduplication, [which] may have adverse performance implications,” said Greg Schulz, founder and senior analyst at Stillwater, Minn.-based StorageIO. Achieving efficient storage requires a balancing act of optimizing the utilization of available resources (capacity and processing power), performance and cost under the governing boundaries of application requirements.
The fundamentals of storage efficiency
Performance optimization of a storage system requires maximizing storage processing resource utilization while minimizing storage system congestion. As processing utilization increases, the remaining processing resources and the ability to serve additional requests decrease; if that buffer is too small, the likelihood of hitting a performance limit increases. For instance, an array that operates at 50% average performance utilization is less likely to hit performance limits than one that has an 80% average utilization.
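A simple queueing approximation gives a rough feel for why that buffer matters. The sketch below assumes a single-server M/M/1 model, which is an illustrative simplification and not a description of how any particular array behaves; the function name and the 5 ms service time used in the note are hypothetical.

```python
def mm1_response_time_ms(service_time_ms: float, utilization: float) -> float:
    """Average response time under an assumed M/M/1 queueing model.

    R = S / (1 - rho), where S is the service time per request and rho is
    the average utilization of the resource (0 <= rho < 1).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_ms / (1.0 - utilization)
```

Under these assumptions, a request that takes 5 ms to service sees an average response time of about 10 ms at 50% utilization and about 25 ms at 80%, which is why the busier array has far less room to absorb a burst.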
Reporting, monitoring and storage analytics that come with storage systems, or are available as add-ons from array vendors or third parties, are critical to optimizing performance utilization and adding processing resources as needed, rather than oversizing an array or reacting after the fact. “Whether it’s performance or capacity, measuring metrics and knowing what you have through tools from storage vendors or third parties is the most important thing,” Schulz said.
These tools not only help identify performance issues, they’re important in determining the appropriate cure. Performance of a storage system isn’t determined by a single metric but by a combination of factors that affect how fast data can be served to apps. IOPS, throughput and latency are the key factors, but they vary depending on the workload (random vs. sequential), block size (large vs. small) and transaction type (read vs. write), and their relevance and impact on performance are application dependent. For instance, in video streaming applications, fast sequential reads of bigger files and larger blocks prevail, but randomized reads are usually the predominant transactions found in virtualized environments.
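The link between IOPS, block size and throughput is simple arithmetic, and it shows why the same device can look fast or slow depending on the workload. The helper below is a minimal sketch; its name and the figures in the note are illustrative, not measurements from any product.

```python
def throughput_mbps(iops: float, block_size_kb: float) -> float:
    """Throughput in MBps implied by an IOPS rate and a block size.

    Uses 1 MB = 1,024 KB. This is the identity
    throughput = IOPS x block size, nothing more.
    """
    return iops * block_size_kb / 1024.0
```

At 200 IOPS, 4 KB random reads move well under 1 MBps, while 512 KB sequential reads at the same rate move 100 MBps. The first workload is bounded by IOPS and latency, the second by throughput.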
We’ll now look at techniques for optimizing performance and storage resource utilization. Since there isn’t a magic storage bullet, each approach has its merits and disadvantages.
Add more disk drives for more performance
Disk drives are mechanical devices in which read and write heads move between the center and periphery of a fast rotating platter to find and access data. Even in the fastest disk drives, which operate at 15,000 rpm, rotational delay and the time it takes for mechanical arms to reposition add up to a few milliseconds of latency, limiting each disk to a few hundred IOPS and throughput to less than 100 MBps.
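The per-disk ceiling can be approximated from the mechanics. The sketch below assumes each random I/O costs one average seek plus half a platter rotation and ignores transfer time and queuing; the function names and the 3 ms seek time in the note are assumptions for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: half of one full rotation, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def estimated_disk_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rough random-I/O ceiling: one I/O per (average seek + rotational latency)."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))
```

A 15,000 rpm drive completes a rotation in 4 ms, so its average rotational latency is 2 ms. With an assumed 3 ms average seek, the estimate comes to roughly 200 random IOPS per drive.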
One way to scale performance is by spreading data across multiple disks that work in unison when data is accessed, increasing IOPS and throughput in proportion to the number of disks involved. Additionally, array vendors have implemented techniques like short stroking to minimize arm movements. Placing data on the periphery of a platter greatly reduces read-write head movements, resulting in a significant performance boost. While a given performance goal can be achieved with a large number of disks and short stroking, it’s very costly, and because only the outer parts of platters are used, storage utilization is dismal. Prior to the emergence of solid-state drives (SSDs), a large number of disks and methods like short stroking were used to meet high-performance requirements, and even today they’re used for applications where the high cost of solid-state storage still favors disk drives over SSDs. “For sequential access of larger blocks and files, disk is usually more cost-effective,” said Mike Riley, director of strategy and technology of America sales at NetApp.
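The trade-off between spindle count and capacity utilization can be put into numbers. The sketch below assumes performance scales linearly with the number of disks, as described above; the function names and the sizing figures in the note are hypothetical.

```python
import math


def disks_for_iops(target_iops: float, iops_per_disk: float) -> int:
    """Disks needed to reach a target IOPS, assuming linear scaling per disk."""
    return math.ceil(target_iops / iops_per_disk)


def capacity_utilization(used_gb: float, disk_count: int, disk_size_gb: float) -> float:
    """Fraction of raw capacity that actually holds data."""
    return used_gb / (disk_count * disk_size_gb)
```

For example, a 20,000 IOPS target at 200 IOPS per disk calls for 100 disks. If the application only needs 6,000 GB and each disk holds 600 GB, capacity utilization is 10%: the array is sized for performance and most of the purchased capacity sits idle.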
RAID and wide-striping
Easily overlooked, RAID and RAID levels both impact performance and capacity; changing the RAID level of an existing array to improve either performance or capacity utilization is a feasible option. The number of parity drives, large vs. smaller stripes, the size of RAID groups, and the block size within RAID groups all impact performance and available capacity.
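To make the capacity and performance effect of a RAID level concrete, the sketch below uses the commonly cited textbook figures for usable capacity and back-end I/Os per host write (the "write penalty"). These are rules of thumb rather than values from any vendor, controller caching is ignored, and all names are illustrative.

```python
# Commonly cited back-end I/Os per host write for each RAID level.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}
MIN_DISKS = {"raid0": 1, "raid10": 2, "raid5": 3, "raid6": 4}


def usable_fraction(level: str, disks: int) -> float:
    """Fraction of raw capacity left for data in one RAID group."""
    if level not in WRITE_PENALTY:
        raise ValueError(f"unknown RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "raid0":
        return 1.0
    if level == "raid10":
        return 0.5                    # every block is mirrored
    if level == "raid5":
        return (disks - 1) / disks    # one disk's worth of parity
    return (disks - 2) / disks        # raid6: two disks' worth of parity


def effective_iops(level: str, disks: int, iops_per_disk: float,
                   write_fraction: float) -> float:
    """Host-visible IOPS once each write is multiplied by the write penalty."""
    if level not in WRITE_PENALTY:
        raise ValueError(f"unknown RAID level: {level}")
    raw_iops = disks * iops_per_disk
    cost_per_io = (1.0 - write_fraction) + write_fraction * WRITE_PENALTY[level]
    return raw_iops / cost_per_io
```

With eight disks, RAID 5 leaves 87.5% of raw capacity usable and RAID 6 leaves 75%. At an assumed 200 IOPS per disk and a workload that is half writes, the same eight disks deliver about 640 host IOPS under RAID 5, compared with 1,600 if no penalty applied.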
While the characteristics of standard RAID levels are well known (see “RAID levels and their impact on performance and capacity utilization,” above), there are a couple of lesser known trends that deserve special attention in an efficient storage discussion. To start with, the size of a RAID group impacts performance, availability and capacity. Usually, larger RAID groups with a higher number of disks are faster, but require more time to rebuild in case of a disk failure. With high-capacity disks doubling in size every few years, rebuild times are increasing and the risk of more than one drive failing rises. Even though RAID 6 with its dual-parity drives reduces the risk by permitting two concurrent disk failures with some performance penalty, a better approach is eliminating dedicated parity drives. For instance, NetApp’s Dynamic Disk Pools (DDP) distribute data, parity information and spare capacity across a pool of drives, and utilize every drive in the pool for the intensive process of rebuilding a failed drive. Hewlett-Packard (HP) Co.’s 3PAR storage systems deploy a technique called wide striping that stripes data across a larger number of disks and subdivides disks’ raw storage capacity within this pool into small “chunklets.” The 3PAR volume manager uses these “chunklets” to form micro-RAIDs with parity “chunklets.” Because all “chunklets” of a single micro-RAID are located on different drives and are small in size, the risk and performance impact during drive failures and subsequent rebuilds is minimized.
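The general idea of spreading many small RAID sets over a pool can be shown with a toy model. The sketch below is not HP’s or NetApp’s algorithm; it is a hypothetical greedy placement that builds each small set from chunks on different drives, and then counts which surviving drives would take part in a rebuild. All names and parameters are assumptions.

```python
import random
from collections import Counter


def build_micro_raids(drive_count: int, chunklets_per_drive: int,
                      width: int, seed: int = 0) -> list:
    """Group chunklets into sets of `width`, each set on `width` distinct drives.

    Greedy placement: every set takes one chunklet from the drives with the
    most free chunklets; ties are broken randomly so sets mix across the pool.
    """
    if width > drive_count:
        raise ValueError("set width cannot exceed the number of drives")
    rng = random.Random(seed)
    free = {drive: chunklets_per_drive for drive in range(drive_count)}
    sets = []
    while True:
        candidates = [drive for drive in free if free[drive] > 0]
        if len(candidates) < width:
            break  # not enough distinct drives left for another set
        candidates.sort(key=lambda drive: (-free[drive], rng.random()))
        members = candidates[:width]
        for drive in members:
            free[drive] -= 1
        sets.append(tuple(members))
    return sets


def rebuild_sources(sets: list, failed_drive: int) -> Counter:
    """Count, per surviving drive, how many affected sets it helps rebuild."""
    load = Counter()
    for members in sets:
        if failed_drive in members:
            load.update(d for d in members if d != failed_drive)
    return load
```

Calling `rebuild_sources(build_micro_raids(12, 6, 4), failed_drive=0)` lists the drives that would contribute to rebuilding drive 0. In a traditional RAID group only the other members of that one group do the work; here the work can be shared by more of the pool.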
Without question, the advent of solid-state storage has been a disruptive event that has given storage system vendors a completely new set of options to optimize both performance and capacity. With latency measured in microseconds rather than milliseconds and more than an order of magnitude more IOPS (north of 10,000 vs. a few hundred) than disk drives, flash (and more often a combination of hard disks and flash) is giving storage managers a cost-effective alternative to the traditional approach of spreading data across a large number of disks to achieve high storage performance. Best suited for applications with a large number of random reads, such as hypervisors, solid-state storage is less well suited to sequential access of larger blocks and large files.
“Flash is still an order of magnitude more expensive than high-end disk drives and it’s therefore important to use it wisely and deploy it for the right tasks,” said Eric Herzog, senior vice president, product management and marketing for EMC’s Unified Storage Division.
Today, solid-state storage can be deployed in three ways:
Solid-state disks in place of mechanical disks. Replacing disk drives with SSDs is the simplest way of boosting array performance. When opting for this route, however, it’s crucial to verify with the array vendor the impact of SSDs on the array and to heed the vendor’s guidelines. SSDs can wreak havoc on a storage system if the storage system’s processors can’t support the high performance of the solid-state storage. A performance problem can quickly turn from bad to worse if SSDs overwhelm storage controllers. Another issue relates to the mechanics of how and when data is moved on and off solid-state drives. In its simplest and least preferable form, SSD can be allocated manually to certain applications, such as database log files. While this may be the only option for older arrays, more automated mechanisms, such as EMC’s Fully Automated Storage Tiering (FAST), are preferable.
Flash as cache on storage systems. Using flash as cache to extend the relatively small DRAM cache avoids many of the challenges associated with substituting SSDs for disk drives. Since a flash cache is part of the storage system architecture, storage controllers are designed to support whatever amount of flash the array permits. Flash as cache also resolves the tiering challenge. By definition, a cache will always contain the most active data while stale data resides on mechanical disks. While solid-state drives only benefit data that resides on SSD, a flash cache benefits all data that traverses the storage system. A flash cache has few drawbacks, but one is that it’s only an option in newer arrays. Complementing high-capacity drives with flash cache is becoming a more common storage architecture choice because it enables arrays that combine both high capacity and high performance.
“By adding 2% to 3% of SSD to a disk-based array, you can almost double the throughput,” said Ron Riffe, business line manager, storage software at IBM.
Flash in servers. The closer data is to server processors and memory, the better the storage performance. Placing flash storage in servers via PCIe cards from the likes of Fusion-io yields optimal storage performance. On the downside, flash storage in servers usually isn’t shared and can only be used by applications that reside on the server, and it’s very expensive. Nevertheless, extending storage into servers is actively pursued by NetApp, with an initiative to make Data Ontap available to run on hypervisors, as well as EMC with its VFCache, formerly known as Project Lightning. It’s obviously the goal of both vendors to provide a very high-performance, server-side flash storage tier that integrates seamlessly with their external storage systems.
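The flash-as-cache option described above can be pictured with a small simulation. The sketch below assumes a least-recently-used (LRU) eviction policy, which is a common textbook choice and not necessarily what any array uses; the class, the function and the latency figures in the note are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache of block IDs that evicts the least recently used block."""

    def __init__(self, capacity_blocks: int) -> None:
        if capacity_blocks <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block ID -> True, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id: int) -> bool:
        """Return True on a cache hit; on a miss, cache the block and return False."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used
        return False

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def effective_latency_ms(hit_ratio: float, flash_ms: float, disk_ms: float) -> float:
    """Average read latency when hits are served from flash and misses from disk."""
    return hit_ratio * flash_ms + (1.0 - hit_ratio) * disk_ms
```

Feeding a trace of block reads through `LRUCache.read` and passing the resulting `hit_ratio()` to `effective_latency_ms` shows how a small cache can pull average latency toward flash speeds when a workload keeps returning to the same blocks. With an assumed 0.1 ms for flash and 5 ms for disk, a 50% hit ratio averages about 2.55 ms.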
Storage acceleration appliances
Storage acceleration devices are placed between servers and storage systems with the intent of either increasing performance or providing additional storage services, such as virtualization. Because most of them work with existing heterogeneous storage systems, they should be on anyone’s evaluation list when battling a storage efficiency or performance problem. Moreover, their ability to boost performance of existing arrays and pull existing heterogeneous storage into a single storage pool helps extend the life of existing storage systems while providing an overall increase in storage performance and lowering storage costs.
The recently released IBM SmartCloud Virtual Storage Center is a prime example of a product in this category. It combines storage virtualization, leveraging the SAN Volume Controller (SVC) software, storage analytics and management into a single product. It gathers heterogeneous physical storage arrays into a single virtualized storage pool, and supports thin provisioning of volumes from this pool; it recognizes and supports flash in attached storage systems but also allows adding flash to the SmartCloud Virtual Storage Center itself. It identifies high I/O data and hotspots via real-time storage analytics and automatically moves data from disk drives to flash and vice versa. These capabilities enable SmartCloud Virtual Storage Center to significantly increase performance and capacity utilization of existing heterogeneous storage systems. While IBM SmartCloud Virtual Storage Center is block-based, file system-based acceleration devices are available from the likes of Alacritech and Avere Systems.
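Automated movement of hot data between disk and flash can be sketched in a few lines. The example below is a hypothetical illustration of frequency-based tiering, not IBM’s implementation: it counts accesses per extent over an observation window and keeps the busiest extents on flash. The function names and the notion of a fixed number of flash slots are assumptions.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def select_flash_extents(io_trace: Iterable[int], flash_slots: int) -> Set[int]:
    """Return the IDs of the `flash_slots` most frequently accessed extents."""
    counts = Counter(io_trace)
    return {extent for extent, _ in counts.most_common(flash_slots)}


def plan_moves(current_flash: Set[int],
               desired_flash: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Return (promote, demote): extents to move to flash and back to disk."""
    promote = desired_flash - current_flash
    demote = current_flash - desired_flash
    return promote, demote
```

Given a trace in which extents 4 and 1 are read most often and flash currently holds extents 2 and 4, `plan_moves` proposes promoting extent 1 and demoting extent 2. A real product would also weigh the cost of each move and how recent the activity is.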
Increasing capacity utilization
Maximizing the use of available disk space is accomplished through thin technologies, such as thin provisioning and thin clones, as well as compression and data deduplication. All of these techniques share the goal of not storing the same data twice and instead referencing existing blocks of data wherever possible. While thin technologies have little impact on storage performance, deduplication and compression usually do and should therefore only be turned on if the performance impact is clearly understood and acceptable.
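Block-level deduplication illustrates both the capacity gain and the source of the performance cost. The sketch below is a minimal in-memory model that assumes fixed-size blocks identified by SHA-256 digests; the class and method names are hypothetical, and production systems add persistence, reference counting and collision handling.

```python
import hashlib


class DedupStore:
    """Stores each unique fixed-size block once and references it by digest."""

    def __init__(self, block_size: int = 4096) -> None:
        self.block_size = block_size
        self.blocks = {}          # digest -> block bytes
        self.logical_bytes = 0    # bytes written by clients

    def write(self, data: bytes) -> list:
        """Split data into blocks, store unseen blocks, return the digest list."""
        refs = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()  # hashing is the CPU cost
            self.blocks.setdefault(digest, block)       # keep the first copy only
            refs.append(digest)
        self.logical_bytes += len(data)
        return refs

    def read(self, refs: list) -> bytes:
        """Reassemble data from its list of block digests."""
        return b"".join(self.blocks[digest] for digest in refs)

    def physical_bytes(self) -> int:
        return sum(len(block) for block in self.blocks.values())

    def dedup_ratio(self) -> float:
        physical = self.physical_bytes()
        return self.logical_bytes / physical if physical else 0.0
```

Writing 12 bytes that contain the same 4-byte block twice, with a block size of 4, stores only 8 bytes, for a ratio of 1.5 to 1. Every write pays for a hash computation and a lookup, which is the kind of overhead that should be understood before deduplication is turned on.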
Performance and capacity: Inseparable
Storage performance and capacity utilization are joined at the hip, and improving one often has adverse implications for the other. Storage analytics and reporting of actual array performance are a prerequisite for identifying chokepoints and remedying them appropriately. Typically, improving storage efficiency comes down to balancing performance requirements and cost.
BIO: Jacob N. Gsoedl is a freelance writer and a corporate director for business systems.