Published: 01 May 2010
The various forms of RAID have been around for a long time and have done a good job of protecting data. But high-capacity drives and new performance demands have spurred development of RAID alternatives.
Redundant array of independent disks (RAID) has been the standard for disk-based data protection since 1989, and is a proven and reliable method that's considered a basic data storage building block. Basic storage principles tend to change very slowly and, despite its popularity and track record, change is coming to RAID.
Seeing why an alternative to RAID might be appealing requires some understanding of RAID and of the growing problems with the technology.
RAID shortcomings in the 21st century
The purpose of RAID is to protect data in the event a hard disk drive (HDD) fails. When that failure occurs, data from the failed HDD (or multiple HDDs) is recreated from parity or copied from a mirror, depending on the type of RAID in use. Disk drives are electro-mechanical devices, and they have the highest probability of failure and the lowest mean time between failures (MTBF) of any component in a storage system.
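As a rough illustration of what "recreated from parity" means, the sketch below uses single XOR parity of the kind RAID 5 relies on. The block contents and the helper name xor_blocks are made up for the example, and real arrays work on fixed-size stripes rather than short byte strings.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks striped across three drives, plus one parity block.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block
# from the surviving data blocks and the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the missing block. Note that the rebuild has to read every surviving drive in the set.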
It takes a lot of HDDs to keep up with the rapid growth rate of data storage that analyst firms like IDC, Gartner and Enterprise Strategy Group peg at somewhere between 50% and 62% per year. Statistically speaking, more hard disk drives mean more HDD failures. Disk drive manufacturers have continually increased HDD density, and today we have 2 TB SATA and are likely going to 4 TB by the end of this year. Even high-performance SAS and Fibre Channel (FC) drive capacities are pushing 600 GB. RAID problems quickly become evident when a rebuild is required with those increasingly dense drives.
Each RAID type (see "Traditional RAID levels," below) has tradeoffs in write performance, read performance, level of data protection, speed of data rebuilds and the usable storage on each hard disk drive. For example, if guaranteeing data availability is the top priority, then some variation of mirroring or multiple mirrors (RAID 1, 10, triple mirror, etc.) will be required. Having full copies of the data on other HDDs or RAID sets simplifies protection and recovery of the data, but at a severe and tangible cost: each mirror consumes as much raw capacity as the original data. In addition, system resources are required for every copy, which can impair I/O performance. Realistically, most organizations aren't this overprotective; most use RAID 5 and/or RAID 6.
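The capacity side of that tradeoff is simple arithmetic. The following sketch assumes a single RAID group with no hot spares or formatting overhead; the function name usable_fraction and the level labels are chosen for this example only.

```python
def usable_fraction(level, drives, copies=2):
    """Fraction of raw capacity left for data in one RAID group.

    level  -- "mirror", "raid5" or "raid6" (labels used only in this sketch)
    drives -- number of drives in a parity group (ignored for mirrors)
    copies -- total copies kept by a mirror (2 = RAID 1/10, 3 = triple mirror)
    """
    if level == "mirror":
        return 1 / copies
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least three drives")
        return (drives - 1) / drives   # one drive's worth of parity
    if level == "raid6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least four drives")
        return (drives - 2) / drives   # two drives' worth of parity
    raise ValueError(f"unknown level: {level}")

raw_tb = 8 * 2  # eight 2 TB drives
raid5_tb = raw_tb * usable_fraction("raid5", 8)
raid6_tb = raw_tb * usable_fraction("raid6", 8)
mirror_tb = raw_tb * usable_fraction("mirror", 8)
```

With eight 2 TB drives, that works out to 14 TB usable under RAID 5, 12 TB under RAID 6 and 8 TB when mirrored.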
[Chart: Traditional RAID levels]
When an HDD fails in a RAID 5 set, the system rebuilds the data on a spare drive that replaces the failed hard disk drive. To reconstruct the data, the storage system reads every sector on every surviving HDD in the RAID set. This heavy utilization of the other HDDs in the RAID set increases the likelihood of another HDD failure (usually a non-recoverable read error) by an order of magnitude, which significantly increases the likelihood of data loss. Ten or 20 years ago, when disk capacities were much lower, rebuilds were measured in minutes. But with disk capacities in the terabytes, rebuilds can take hours, days or even weeks. If application users can't tolerate the system performance degradation that rebuilds cause, the rebuild is given a lower priority and rebuild times increase dramatically. Longer data reconstruction times typically equate to significantly higher risks of data loss. Because of this, many storage shops are stepping up their use of RAID 6.
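To put rough numbers on the rebuild problem, the sketch below estimates how long a full-drive rebuild takes and how likely it is to hit a non-recoverable read error along the way. The error rate of one per 10^14 bits is an assumption (a figure commonly published for desktop-class SATA drives), the rebuild rates are illustrative rather than measured, and treating bit errors as independent is a simplification.

```python
import math

def rebuild_hours(capacity_tb, rebuild_mb_per_s):
    """Hours needed to rewrite one whole drive at a sustained rebuild rate."""
    capacity_mb = capacity_tb * 1_000_000   # decimal terabytes, as drive vendors count
    return capacity_mb / rebuild_mb_per_s / 3600

def p_read_error(surviving_drives, capacity_tb, ure_per_bit=1e-14):
    """Chance of at least one non-recoverable read error while reading
    every surviving drive in full, assuming independent bit errors."""
    bits_read = surviving_drives * capacity_tb * 1e12 * 8
    # 1 - (1 - p)^n, computed in a numerically stable way
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

full_speed = rebuild_hours(2, 50)   # 2 TB drive, 50 MB/s rebuild rate
throttled = rebuild_hours(2, 5)     # same drive, rebuild given low priority
risk = p_read_error(7, 2)           # eight-drive RAID 5 set, one drive lost
```

Under these assumptions, a 2 TB drive rebuilt at 50 MB/s takes roughly 11 hours, throttling the rebuild to a tenth of that rate stretches it past four days, and reading seven surviving 2 TB drives end to end carries about a two-in-three chance of hitting at least one non-recoverable read error.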
RAID 6 adds a second set of parity that protects the data even if two HDDs in the RAID set fail or have a non-recoverable read error. The risk of data loss drops dramatically, but the extra parity consumes additional usable capacity, and system performance takes a bigger hit if two drives in the same RAID group must be reconstructed simultaneously. More disturbing is the increased risk of data loss if a third HDD fails or a non-recoverable read error occurs during the rebuild.
There are other RAID issues such as "bit rot" (when HDDs acquire latent defects over time from background radiation, wear, dust, etc.) that can cause a data reconstruction to fail. Most storage systems include some type of background scrubbing that reads, verifies and corrects bit rot before it becomes non-recoverable, but scrubbing takes system resources. And higher capacities mean more time is needed to scrub.
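A background scrub can be pictured as the loop below. This is a minimal sketch that assumes a two-way mirror with a CRC32 checksum stored per block; real arrays use their own checksum and repair schemes, and the function name scrub is hypothetical.

```python
import zlib

def scrub(primary, mirror, checksums):
    """Verify every block on both copies and repair a bad copy from the good one.

    Returns a list of (copy_name, block_index) pairs that were repaired.
    """
    repaired = []
    for i, expected in enumerate(checksums):
        ok_primary = zlib.crc32(primary[i]) == expected
        ok_mirror = zlib.crc32(mirror[i]) == expected
        if ok_primary and not ok_mirror:
            mirror[i] = primary[i]
            repaired.append(("mirror", i))
        elif ok_mirror and not ok_primary:
            primary[i] = mirror[i]
            repaired.append(("primary", i))
        elif not ok_primary and not ok_mirror:
            raise IOError(f"block {i} is unrecoverable on both copies")
    return repaired

blocks = [b"alpha", b"bravo", b"charlie"]
checksums = [zlib.crc32(b) for b in blocks]
primary = list(blocks)
mirror = list(blocks)
mirror[1] = b"brav0"   # simulate a latent defect on the mirror copy
repaired = scrub(primary, mirror, checksums)
```

The loop touches every block on every copy, which is why scrub time grows with drive capacity.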
Another RAID issue is documenting the chain of ownership for a replaced HDD, which means keeping a documented trail (who, what, where, when) for the failed HDD from the time it was pulled to the time it was destroyed or reconditioned. It's a tedious, manually intensive task that's a bit less stringent if the HDD is encrypted. Even more frustrating is that the vast majority of failed HDDs sent back to the factory for analysis or reconditioning (somewhere between 67% and 90%) turn out to have no fault. Regrettably, that discovery happens after the system failed the HDD, the HDD was pulled, the data was reconstructed and the chain of ownership documented. That's a lot of operational pain for "no failure found."
Solid-state storage devices actually exacerbate the aforementioned RAID problems. Because solid-state drives (SSDs) can handle high-performance applications, they allow for storage systems with fewer high-performance HDDs and more high-density, low-performance hard disk drives. Tom Georgens, NetApp's CEO, recently noted that "fast access data will come to be stored in flash with the rest in SATA drives." Lower CapEx and OpEx for the system can end up translating into higher OpEx because of the increase in RAID problems.
These RAID issues have inspired numerous vendors, academicians and entrepreneurs to come up with alternatives to RAID. We categorize those innovative alternatives into three groups: RAID + innovation, RAID + transformation and paradigm shift.
RAID + innovation
Several vendors have addressed traditional RAID problems by taking an incremental approach to RAID that leverages its reliability while diminishing some of the tradeoffs (see "RAID enhancements," below). IBM's EVENODD (implemented by EMC on Symmetrix DMX) and NetApp's RAID-DP (implemented on NetApp's FAS and V-series) have enhanced RAID 6 by reducing algorithm overhead while increasing performance.
[Chart: RAID enhancements]
NEC Corp.'s RAID-TM or triple mirror (implemented in its D-Series systems) aims to eliminate the RAID 1 risk of losing data when both the primary and mirror drives fail or have a non-recoverable read error. RAID-TM writes data simultaneously to three separate HDDs, so if two HDDs in the same mirror fail or have non-recoverable read errors, the app still has access to its data with no degradation in performance, even as the drives are rebuilt. The advantage is performance; the disadvantage is far less usable capacity.
RAID-X is an IBM XIV Storage System innovation that uses a wide stripe to reduce RAID tradeoffs of performance and data loss risk. It's basically a variation of RAID 10 that uses intelligent risk algorithms to randomly distribute block mirrors throughout the entire array. This approach allows XIV to reconstruct the data on very large 2 TB HDDs in less than 30 minutes. As with all mirroring technology, the tradeoff is reduced usable capacity.
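The benefit of scattering mirrored blocks across the whole array can be seen in a toy model. The sketch below is not IBM's algorithm; it simply places the two copies of each chunk on two different, randomly chosen drives and then counts which surviving drives would supply data after one drive fails. The drive count, chunk count and function names are arbitrary assumptions.

```python
import random
from collections import Counter

def place_mirrored_chunks(num_chunks, num_drives, seed=0):
    """Put the two copies of every chunk on two distinct, randomly chosen drives."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(num_drives), 2)) for _ in range(num_chunks)]

def rebuild_sources(placement, failed_drive):
    """Count, per surviving drive, how many chunks it must supply for the rebuild."""
    sources = Counter()
    for first, second in placement:
        if first == failed_drive:
            sources[second] += 1
        elif second == failed_drive:
            sources[first] += 1
    return sources

placement = place_mirrored_chunks(num_chunks=100_000, num_drives=100)
sources = rebuild_sources(placement, failed_drive=0)
busiest = max(sources.values())
```

In this model every surviving drive contributes a small share of the rebuild, so the work proceeds in parallel instead of hammering a single mirror partner. That is the intuition behind the short rebuild times claimed for wide-striped mirroring.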
Hewlett-Packard Co.'s LeftHand Networks and Pivot3 Inc. provide similar variations of Network RAID for their x86-based clustered iSCSI storage. Network RAID leverages the concept of RAID, but uses storage nodes as its lowest component level instead of disk drives. This allows it to distribute a logical volume's data blocks across the cluster with one to four data mirrors depending on the Network RAID level. Ongoing block-level, self-healing nodal health checks allow Network RAID to copy and repair the data to another node before a failure occurs. This decreases the probability of a hard disk drive fault or non-recoverable read error causing a performance-sapping rebuild, but like all mirroring technology, it reduces the amount of usable storage.
These are just some of the RAID + innovation technologies. Others are currently incubating, including proposals for RAID 7 (triple parity and more) or TSHOVER (triple parity).
RAID + transformation
There are also RAID alternatives that attempt to re-invent RAID. They typically use RAID and are layered on top of it in some way. The concept is to keep what's good about RAID and fix the rest. Examples of transformation technologies include self-healing storage and BeyondRAID.
Self-healing storage: Xiotech Corp.'s Intelligent Storage Elements (ISE) is a good example of self-healing storage. ISE tightly integrates RAID and HDDs, and combines them into a single storage element.
Xiotech engineered ISE to resolve most RAID rebuild issues by eliminating 67% to 90% of the rebuilds. It starts by reducing HDD faults, proactively healing hard disk drives before a fault occurs with reconditioning algorithms similar to those employed by the factory. It also uses advanced vibration controls and sealed systems called DataPacs to keep outside influences from causing HDD faults. When a fault does occur, it reacts by providing remedial component repair within the sealed DataPac using methods similar to what the original manufacturer uses. It analyzes power cycles, recalibrates components, remanufactures the HDD, and migrates data when required to other sectors or HDDs. If the fault persists, ISE will isolate just the non-recoverable sectors and then initiate data reconstruction only for the faulted HDD sectors. So there are far fewer rebuilds and, when one is required, there's much less to reconstruct. In addition, it's all automated, so no manual intervention to pull failed drives is required. The result is equivalent to a factory-remanufactured HDD with only the components that are beyond repair taken out of service. The downside to this transformational technology is that it has higher up-front costs, although it lowers the total cost of ownership (Xiotech provides a five-year warranty).
Atrato Inc.'s Velocity1000 (V1000) uses a self-healing technology called Fault Detection, Isolation and Recovery (FDIR) in combination with Atrato's Virtualization Engine (AVE). FDIR watches component and system health, and adds self-diagnostics and autonomic self-healing, but it doesn't attempt to remanufacture or recondition HDDs in place as Xiotech does. Atrato puts 160 2.5-inch SATA drives in a 3U system called SAID (self-maintaining array of independent disks). Atrato uses its extensive database of SATA drive performance, built from operational reliability testing (ORT), to monitor the installed drives' actual performance and detect deviations. Atrato also deals with HDD faults by first attempting to repair the faulting HDD sectors (although not with manufacturer-level reconditioning, remanufacturing or component recalibration). If the fault or non-recoverable read error can't be repaired, the sector is isolated and only the affected data is reconstructed and remapped to virtual spare capacity. If a disk drive completely fails, it's reconstructed and remapped to the virtual spare capacity. Atrato reduces the number of rebuilds and rebuild times by reconstructing only affected data on virtual drives. Atrato backs its technology with a three-year warranty.
The heal-in-place approach to disk failure in DataDirect Networks Inc.'s S2A technology attempts several levels of HDD recovery before a hard disk drive is removed from service. When an HDD shows aberrant behavior, the system begins journaling all writes to that HDD and then attempts recovery operations. When recovery operations succeed, only a small portion of the HDD requires rebuilding using the journaled information, so rebuild times are reduced and a service call may be avoided.
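The idea of rebuilding only what a drive missed can be sketched in a few lines. This is a toy model under stated assumptions, not DDN's implementation: the class HealInPlaceDrive is hypothetical, and the list called healthy stands in for data that the rest of the RAID set can supply.

```python
class HealInPlaceDrive:
    """Toy drive that journals which blocks it missed while under recovery."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.under_recovery = False
        self.missed = set()   # journal of block indexes written during recovery

    def write(self, index, data):
        if self.under_recovery:
            self.missed.add(index)      # record the write instead of applying it
        else:
            self.blocks[index] = data

    def return_to_service(self, healthy_copy):
        """Rebuild only the journaled blocks; return how many were copied."""
        for index in self.missed:
            self.blocks[index] = healthy_copy[index]
        rebuilt = len(self.missed)
        self.missed.clear()
        self.under_recovery = False
        return rebuilt

healthy = [b"a", b"b", b"c", b"d"]   # what the rest of the RAID set can reproduce
drive = HealInPlaceDrive(healthy)
drive.under_recovery = True          # drive starts misbehaving; recovery begins
for index, data in [(1, b"B"), (3, b"D")]:
    healthy[index] = data
    drive.write(index, data)
rebuilt = drive.return_to_service(healthy)
```

Here only two of the four blocks are rewritten when the drive returns to service. On a multi-terabyte drive that difference separates a brief catch-up from a full rebuild.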
Panasas Inc.'s ActiveScan technology continuously monitors HDDs and their contents to detect problems. ActiveScan monitors data objects, RAID parity, disk media and the disk drive attributes. When a potential problem is detected, data is moved to spare blocks on the same disk. Future HDD failure is predicted through the use of HDD self-monitoring analysis and reporting technology (SMART) attribute statistical analysis, permitting action to be taken to protect data before a failure occurs. When an HDD failure is predicted, user-set policies facilitate preemptively migrating the data to other HDDs, which eliminates or mitigates the need for reconstruction.
LSI Corp. and NEC both detect HDD sector errors while allowing operations to continue with the other drives in the RAID group. If an alternative sector can be assigned, the HDD is allowed to return to operation, avoiding a complete rebuild. Performance is maintained throughout the detection and repair process. This is a limited self-healing technology that reduces the number of rebuilds and helps maintain performance.
3PAR's InSpire Architecture is engineered to sustain high performance levels by leveraging advanced HDD error isolation to reduce the amount of data that requires reconstruction, and by taking advantage of its massive parallelism to provide rapid rebuilds (typically less than 30 minutes). The system uses "chunklets" in its many-to-many drive relationships. That same massive parallelism allows 3PAR to isolate RAID sets across multiple drive chassis to minimize the risk of data loss if a chassis is lost.
BeyondRAID: Data Robotics Inc.'s BeyondRAID sits on top of RAID and makes it completely transparent to the administrator. It transforms RAID from a deterministic offline process into an online dynamic one. Essentially self-managing, BeyondRAID chooses RAID sets based on the required data protection at any given point in time. But it's how BeyondRAID addresses RAID issues that truly makes it stand out. It protects against one or two HDD failures and has built-in automatic data self-healing (not storage self-healing). Data blocks are spread across all of the drives, so data reconstruction is very fast. Because the system is "data aware," it allows for different drive sizes, drive re-ordering and proportional rebuild times. Because it tops out at eight SATA drives, it's most appealing for small- and medium-sized businesses (SMBs), but it's a true turn-it-on, hook-it-up and forget-it storage system.
RAID paradigm shift: Erasure codes
Erasure codes are designed to separate data into unrecognizable chunks of information with additional information added to each chunk that allows any complete data set to be resurrected from some subset of the chunks. The chunks can be distributed to different storage locations within a data center, city, region or anywhere in the world.
Erasure codes have built-in data security because each individual chunk doesn't contain enough information to reveal the original data set. A large enough subset of chunks from the different storage nodes is needed to fully retrieve the total data set, with the number of required chunks determined by the amount of additional information assigned to each chunk. More additional information means fewer chunks are required to retrieve the whole data set.
Erasure codes are resilient against natural disasters or technological failures because only a subset of the chunks is needed to reconstitute the original data. In actuality, with erasure codes there can be multiple simultaneous failures across a string of hosting devices, servers, storage elements, HDDs or networks, and the data will still be accessible in real time.
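The "any subset of the chunks" property can be demonstrated with a small Reed-Solomon-style code built on polynomial interpolation. This is a minimal sketch, not the scheme used by any vendor named in this article: it works one byte at a time over the integers modulo 257, the function names are invented for the example, and it makes no attempt to show the security properties of a real dispersal system.

```python
P = 257  # smallest prime above 255, so every byte value is a valid field element

def interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # needs Python 3.8+
    return total

def encode(data, n):
    """Turn k data bytes into n chunks (x, y); any k chunks recover the data."""
    k = len(data)
    # The data defines a polynomial through x = 1..k; chunks sample it elsewhere.
    points = list(zip(range(1, k + 1), data))
    return [(x, interpolate(points, x)) for x in range(k + 1, k + n + 1)]

def decode(chunks, k):
    """Rebuild the k original bytes from any k surviving chunks."""
    points = chunks[:k]
    return bytes(interpolate(points, x) for x in range(1, k + 1))

data = b"RAID"
chunks = encode(data, n=7)                  # seven chunks, any four suffice
survivors = [chunks[6], chunks[1], chunks[4], chunks[2]]
recovered = decode(survivors, k=len(data))
```

In this example three of the seven chunks can be lost, in any combination, without losing data, at a storage overhead of 1.75 times the original. Keeping three full replicas would cost 3 times the original and tolerate only two losses.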
Also known as forward error correction (FEC), erasure coding storage is a completely different approach than RAID. Erasure codes eliminate all of the RAID issues described here. It's a new approach and at this time only three vendors have erasure code-based products: Cleversafe Inc.'s dsNet, EMC Corp.'s Atmos and NEC's HYDRAstor.
Erasure codes appear to be better suited for large data sets than smaller ones. They're especially appropriate for cloud or distributed storage because they never have to replicate a data set and can distribute the data over multiple geographic locations.
The issues with traditional RAID are well known, and are escalating with higher disk capacities. The RAID alternatives described here address many of those problems, and more new approaches are on the way. Selecting the best fit for a particular environment requires research, testing, pilot programs, patience and a willingness to take a risk with a non-traditional approach.
BIO: Marc Staimer is founder and senior analyst at Dragon Slayer Consulting in Beaverton, Ore., a consulting practice that focuses on strategic planning, product development and market development for technology products. You can contact him at firstname.lastname@example.org.