Feature

Alternatives to RAID

Ezine

This article can also be found in the Premium Editorial Download "Storage magazine: R.I.P. RAID?."

Download it now to read this article plus other related content.

When a HDD fails in a RAID 5 set, the system will rebuild the data on a spare drive that replaces the failed hard disk drive. The storage system then exercises every sector on every HDD in the RAID set to reconstruct the data. This heavy utilization of the other HDDs in the RAID set increases the likelihood of another HDD failure (usually a non-recoverable read error) by an order of magnitude, which significantly increases the likelihood of a data failure. Ten or 20 years ago when disk capacities were much lower, rebuilds were measured in minutes. But with disk capacities in the terabytes, rebuilds can take hours, days or even weeks. If application users can't tolerate the system performance degradation that rebuilds cause, the rebuild is given a lower priority and rebuild times increase dramatically. Longer data reconstruction times typically equate to significantly higher risks of data loss. Because of this, many storage shops are stepping up their use of RAID 6.

RAID 6 provides a second parity or stripe that protects the data even if two HDDs fail or have a non-recoverable read error in the RAID set. The risk of data loss drops dramatically, but the extra stripe consumes additional usable capacity and system performance will take a bigger hit if two drives must be reconstructed simultaneously from the same RAID group. More disturbing is the increased risk of data loss if a third HDD fails or a non-recoverable read error occurs during the rebuild.

There

Requires Free Membership to View

are other RAID issues such as "bit rot" (when HDDs acquire latent defects over time from background radiation, wear, dust, etc.) that can cause a data reconstruction to fail. Most storage systems include some type of background scrubbing that reads, verifies and corrects bit rot before it becomes non-recoverable, but scrubbing takes system resources. And higher capacities mean more time is needed to scrub.

Another RAID issue is documenting the chain of ownership for replacing a failed HDD, which includes the documented trail (who, what, where, when) of the failed HDD from the time it was pulled to the time it was destroyed or reconditioned. It's a tedious, manually intensive task that's a bit less stringent if the HDD is encrypted. Even more frustrating is that the vast majority of failed HDDs sent back to the factory for analysis or reconditioning (somewhere between 67% and 90%) are found to be good or no failure is found. Regrettably, the discovery happens after the system failed the HDD, the HDD was pulled, the data was reconstructed and the chain of ownership documented. That's a lot of operational pain for "no failure found."

Solid-state storage devices actually exacerbate the aforementioned RAID problems. Because solid-state drives (SSDs) can handle high-performance applications, they allow for storage systems with fewer high-performance HDDs and more high-density, low-performance hard disk drives. Tom Georgens, NetApp's CEO, recently noted that "fast access data will come to be stored in flash with the rest in SATA drives." Lower CapEx and OpEx for the system can end up translating into higher OpEx because of the increase in RAID problems.

These RAID issues have inspired numerous vendors, academicians and entrepreneurs to come up with alternatives to RAID. We categorize those innovative alternatives into the three groups: RAID + innovation, RAID + transformation and paradigm shift.

RAID + innovation

Several vendors have addressed traditional RAID problems by taking an incremental approach to RAID that leverages its reliability while diminishing some of the tradeoffs (see "RAID enhancements," below). IBM's EVENODD (implemented by EMC on Symmetrix DMX) and NetApp's RAID-DP (implemented on NetApp's FAS and V-series) have enhanced RAID 6 by reducing algorithm overhead while increasing performance.

Click here to get a PDF of the RAID enhancements chart.

This was first published in May 2010

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: