What you'll learn in this tip: Today's RAID systems have moved far beyond the RAID levels set in the 1980s. Find out why technologies like wide striping, storage virtualization and erasure coding are changing the basic assumptions of RAID.
There may be no technology more intertwined with the enterprise data storage industry than RAID. That's because combining multiple physical hard disk drives (HDDs) into a single virtual drive improves performance and reliability. The RAID systems of 2011 have moved far beyond the traditional disk configurations, known as RAID levels, laid out in the seminal 1988 paper "A Case for Redundant Arrays of Inexpensive Disks (RAID)." Today, data protection increasingly relies on advanced mathematics known as erasure coding. As a result, many have already declared the death of RAID, but the future of data protection will certainly keep RAID technology at the forefront.
Outdated RAID concepts
By way of background, the 1988 paper proposed RAID as a solution to "the pending I/O crisis." Advances in CPU and memory performance were being held back by slow, expensive hard disk drives. RAID was designed to leverage inexpensive PC hard disk drives as an alternative to the "single, large, expensive disks" (SLEDs) used in mainframes at the time.
Today, two of the five proposed "RAID levels," or configurations, remain popular: RAID 1, which mirrors data; and RAID 5, which distributes both data and parity check information across multiple disks. RAID 3 and RAID 4 have also found use cases. Both of these distribute data across disks, but use a single drive for parity check data.
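The parity check behind these levels is simple enough to sketch in a few lines. The Python below is only an illustration, with made-up four-byte blocks standing in for the stripe units on three drives; the helper name xor_blocks is hypothetical and does not come from any RAID product.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per drive in the stripe.
data = [b"RAID", b"5 is", b" XOR"]
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block from the
# two surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, the same function both generates the parity block and rebuilds a missing block. It can recover only one lost block per stripe, which is the limit that later schemes set out to beat.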
All five original RAID levels share certain common elements. They entirely encapsulate a number of hard disk drives, using all of their capacity; the groupings of drives are distinct and inflexible; and the mathematics used are simple and require little computational power. These were reasonable design decisions for the 1980s, but CPU power and storage capacity have moved on since then. CPU power has increased: Where simple XOR math was once a challenge, today's controllers can compute advanced Reed-Solomon codes in real time. Hard disk drive performance has lagged behind capacity; throughput-per-MB and IOPS-per-MB have both decreased more than a thousand-fold in the last two decades.
This is the core issue that post-RAID data protection systems seek to address: Modern systems bear little resemblance to those that RAID was designed for. Rather than a single mainframe using 100 small drives and simplistic controllers, today's enterprise storage systems have incredible processing power and serve hundreds of clients simultaneously over a storage-area network (SAN) or as network-attached storage (NAS). Rigid sets of hard disk drives make little sense in the modern data center.
Next for RAID: wide striping, storage virtualization and erasure coding
Storage manufacturers have been quick to modify and adapt RAID levels to meet the needs of their customers. Technologies like wide striping, storage virtualization and erasure coding are changing the basic assumptions of RAID. Much of this work was unheralded and invisible to customers, however, and the old nomenclature persists.
EMC Corp., Hewlett-Packard (HP) Co. and others abandoned the whole-disk concept in the mid-1990s, building RAID 1 and RAID 5 sets from slices of capacity spread across multiple drives. This was taken further in the 2000s by companies like 3PAR and Compellent Technologies Inc., whose "wide striping" technology places just a little data on each hard disk drive. Spreading data across many more drives improves average performance and reduces the time required to rebuild a RAID set in the event of a failure. Although many arrays still rely on rigidly defined disk groups, most high-end devices spread data more widely.
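A rough back-of-the-envelope calculation shows why spreading the work shortens rebuilds. The sketch below assumes that rebuild time is limited only by how fast the replacement data can be written. It ignores host I/O, read bottlenecks and controller limits. The drive size, the per-drive bandwidth and the function name rebuild_hours are hypothetical figures chosen for illustration, not measurements from any array.

```python
def rebuild_hours(data_to_rebuild_gb, per_drive_mb_s, drives_sharing_writes):
    """Estimate rebuild time when the rebuild writes are spread over several drives."""
    aggregate_mb_s = per_drive_mb_s * drives_sharing_writes
    seconds = data_to_rebuild_gb * 1000 / aggregate_mb_s
    return seconds / 3600

# Hypothetical figures: a failed 2,000 GB drive and 50 MB/s of rebuild
# bandwidth available on each drive that receives rebuilt data.
classic = rebuild_hours(2000, 50, drives_sharing_writes=1)   # one dedicated hot spare
wide = rebuild_hours(2000, 50, drives_sharing_writes=40)     # spare space on 40 drives

print(f"classic: {classic:.1f} hours, wide-striped: {wide * 60:.0f} minutes")
```

Under these assumptions the single hot spare is the bottleneck in the classic case, while the wide-striped layout divides the same work among every drive that holds spare capacity.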
Like its server-based cousin, storage virtualization breaks the rigid link between physical systems and their logical representation. The drives and file systems that virtualized arrays present to servers aren't tied to a specific set of disks. This allows the arrays to freely move data between RAID sets, hard disk drives, flash storage and even across multiple arrays. Conventional RAID might still be used at the lowest level, but storage virtualization overcomes its inflexible layout and performance limits.
As discussed in my August Tech Tip, erasure coding is a new kind of data protection math that goes well beyond the simple parity checks used by classic RAID systems. Although often referred to as "dual parity," most implementations of RAID 6 actually employ advanced Reed-Solomon coding, bringing many advantages over basic parity calculation. These systems can not only recover lost data but also detect data corruption. Some systems disperse data widely across drives, storage nodes and geographies for even greater reliability. Although these calculations were widely known in the 1980s, computing power hadn't advanced far enough to utilize them in storage arrays.
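To make the contrast with plain parity concrete, here is a minimal sketch of a P+Q calculation in the Reed-Solomon style often described for RAID 6. It is an illustration under stated assumptions, not any vendor's implementation: the field polynomial 0x11d and the generator 2 are common choices for arithmetic in GF(2^8), and the names gf_mul, gf_pow, gf_inv, syndromes and recover_two are hypothetical. The sketch covers only the recovery of two lost data blocks, not corruption detection.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reducing by the assumed polynomial 0x11d."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return product

def gf_pow(a, n):
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    """Multiplicative inverse of a nonzero byte: a^254, because a^255 == 1."""
    return gf_pow(a, 254)

def syndromes(blocks, length):
    """Return (P, Q) for {drive_index: block}; P is XOR parity, Q weights drive i by 2^i."""
    p, q = bytearray(length), bytearray(length)
    for i, block in blocks.items():
        coeff = gf_pow(2, i)
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return p, q

def recover_two(survivors, p, q, x, y):
    """Rebuild lost data blocks x and y from the surviving data blocks plus P and Q."""
    sp, sq = syndromes(survivors, len(p))
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for k in range(len(p)):
        pxy = p[k] ^ sp[k]   # equals Dx ^ Dy
        qxy = q[k] ^ sq[k]   # equals gx*Dx ^ gy*Dy
        dx[k] = gf_mul(qxy ^ gf_mul(gy, pxy), denom_inv)
        dy[k] = pxy ^ dx[k]
    return bytes(dx), bytes(dy)

# Four hypothetical data drives; drives 1 and 2 fail at the same time.
data = {0: b"wide", 1: b"stri", 2: b"ping", 3: b"demo"}
p, q = syndromes(data, 4)
survivors = {0: data[0], 3: data[3]}
d1, d2 = recover_two(survivors, p, q, 1, 2)
```

The P block is the same XOR parity that RAID 5 uses. The Q block is what requires finite-field multiplication, and it is the second, independent equation that lets the array solve for two missing blocks instead of one.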
Living in the post-RAID world
Today's enterprise storage systems are just as likely to employ these modern data protection schemes as they are to use classic RAID levels, and most are at least somewhat virtualized. Data storage buyers are likely to encounter any number of new technologies in combinations that make them difficult to assess. It's therefore important to discard outdated "rules of thumb" regarding RAID and focus instead on real-world performance and manageability of systems. Once, the only way to achieve high performance was to combine RAID 1 and data striping (also called "RAID 0") into a "RAID 1+0" or "RAID 10" set. But modern systems with DRAM and flash caches, wide striping and automated tiering can perform even better without the 50% capacity hit of RAID 1. Similarly, database administrators are loath to use RAID 5 due to the limited performance of classical implementations. But today's systems can overcome these issues, delivering more performance than the basic mirrored disks DBAs often request.
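The capacity side of that trade-off is simple arithmetic. The sketch below compares the usable fraction of raw capacity for a mirrored set against single-parity and dual-parity sets of a given width. The function name usable_fraction and the eight-drive example are hypothetical, and real arrays also reserve space for spares and metadata that this ignores.

```python
def usable_fraction(scheme, drives):
    """Fraction of raw capacity left for data under a simple whole-drive layout."""
    if scheme == "raid10":
        return 0.5                       # every block is stored twice
    if scheme == "raid5":
        return (drives - 1) / drives     # one drive's worth of parity
    if scheme == "raid6":
        return (drives - 2) / drives     # two drives' worth of parity
    raise ValueError(f"unknown scheme: {scheme}")

# Hypothetical eight-drive set.
for scheme in ("raid10", "raid5", "raid6"):
    print(f"{scheme}: {usable_fraction(scheme, 8):.1%} usable")
```

For eight drives this works out to half the raw capacity for the mirrored set, seven-eighths for single parity and three-quarters for dual parity.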
Advances in technology have made RAID more common, but not all RAID systems are equal. The power of the CPU and capacity of the cache in an array have much more to do with performance than the arrangement of the disk drives. And disk drives with greater capacity can make a small array appear to be a decent alternative to a larger system, though performance will surely suffer. Put simply, one can't assume that a given system will perform well just because of the RAID level it uses.
The best strategy for storage buyers is to examine the real-world performance of a storage device rather than making assumptions based on RAID levels. They should request references from vendors and examine how a given system supports their applications. RAID is not dead, but the critical issues in enterprise storage have moved beyond it.
BIO: Stephen Foskett is an independent consultant and author specializing in enterprise storage and cloud computing. He is responsible for Gestalt IT, a community of independent IT thought leaders, and organizes their Tech Field Day events. He can be found online at GestaltIT.com, FoskettS.net, and on Twitter at @SFoskett.