Pervasive as it might be, traditional RAID is vulnerable to failures caused by software bugs, mechanical failures at the drive level or bit errors that occur during recording. Chances are high that anyone operating RAID storage will eventually have a problem with the array that requires a data recovery operation, often performed by a RAID recovery service.
The idea behind RAID protocols is simple: using one of five original schemes, data is made recoverable through redundancy, either by keeping a copy of the data or by keeping the information needed to reconstitute it, on one or more disks in an array.
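To illustrate the "data required to reconstitute data" idea, here is a minimal sketch of XOR parity, the mechanism commonly associated with parity-based levels such as RAID 5. The block contents and the `xor_blocks` helper are made up for illustration; real arrays work on fixed-size stripes and rotate parity across disks, which this sketch ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per data disk in a stripe.
data_blocks = [b"RAID", b"data", b"1234"]

# The parity block is what would be written to the extra disk.
parity = xor_blocks(data_blocks)

# Simulate losing the second disk, then rebuild it from the
# surviving data blocks plus the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'data'
```

Because XOR is its own inverse, any single missing block can be regenerated from the others, but two missing blocks cannot. That is why a second drive failure during a rebuild is so damaging in a single-parity set.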
RAID is designed to recover from a certain set of discrete fault cases -- provided the operator notices the fault and responds to it rather quickly. Some RAID arrays will operate at a degraded level until a failed disk is substituted with a working disk. With the substitution accomplished, the operator can manually begin a rebuild process to integrate the new disk into the array, or in some cases, the system will start the rebuild automatically.
The primary challenge in most cases is addressing the disk failure in time, meaning before a second or third drive fails in the same RAID set, which happens more often than you might think. That is because the drives in a RAID set often come from the same manufacturing run and are deployed at the same time, so they age together and are more likely to fail close together. Some RAID techniques can rebuild a RAID set if one drive goes offline, while other proprietary approaches (not part of the original UC Berkeley RAID protocols) can withstand two drive failures.
But data storage managers can count on slow rebuild times with the manual or automatic recovery mechanisms of most RAID technologies. Recovering a RAID set built from 300 GB drives will likely take much less time than recovering the same RAID set built from multi-terabyte drives, because rebuild time grows with the amount of data that must be processed to regenerate recovery data or to redistribute recorded data onto the new drive(s).
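A back-of-the-envelope calculation shows how rebuild time scales with drive size. The sketch below assumes a constant sustained rebuild rate of 50 MB/s, which is a hypothetical figure chosen for illustration rather than a measured one. Real rebuild rates vary with the controller, the RAID level and the production load on the array. The `estimate_rebuild_hours` function is likewise illustrative.

```python
def estimate_rebuild_hours(drive_capacity_gb, rebuild_rate_mb_per_s):
    """Rough lower bound on rebuild time: capacity divided by sustained rate."""
    if drive_capacity_gb <= 0 or rebuild_rate_mb_per_s <= 0:
        raise ValueError("capacity and rate must be positive")
    capacity_mb = drive_capacity_gb * 1000  # decimal units, as drive vendors quote them
    seconds = capacity_mb / rebuild_rate_mb_per_s
    return seconds / 3600


ASSUMED_RATE_MB_PER_S = 50  # hypothetical sustained rebuild rate, not a measured value

for capacity_gb in (300, 4000):
    hours = estimate_rebuild_hours(capacity_gb, ASSUMED_RATE_MB_PER_S)
    print(f"{capacity_gb} GB drive: about {hours:.1f} hours")
```

Under that assumed rate, the 300 GB drive rebuilds in under two hours, while a 4 TB drive takes roughly 22 hours. The array runs degraded, and exposed to a further failure, for that entire window.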
That said, many firms lack the on-site knowledge and skills to accomplish a challenging RAID restore. Some IT planners prefer to outsource to a RAID recovery service for fear of making a minor mistake that compromises all of the data on the RAID set, including the data that has not been affected by the member disk failure or other error event. There are many services in the market that are available to do the job, but selecting the right one does require a bit of legwork.
Four tips to find the best RAID recovery service
From a technology standpoint, a RAID recovery service provider needs to have competency not only in the basic RAID levels or techniques, including the original five (or six, if you include RAID 0 or no RAID protection at all), but also in the additional levels such as RAID 5E, RAID 5EE, RAID 6, RAID 10, RAID 50, RAID 51, RAID 60 and RAID ADG. These RAID levels may be implemented across multiple disk drive types and interconnects, including various Ethernet-based interconnects. Adding to the technical challenge are the variations on the RAID techniques introduced by server and storage system vendors and some media manufacturers.
- Find a RAID recovery service that supports your hardware. The bottom line is that you need to specify the characteristics of your damaged RAID set: make and model of components, versions of RAID firmware and software, sizes and makes of drives, types of interconnects, operating systems used and potentially even the business apps writing data to the storage. Then you need to find a service provider that has experience with your configuration.
- Understand RAID failure causes. Don't be satisfied by the simple assurances a vendor will sometimes make about the "four basic causes of RAID failures." While it is true that RAID system outages occur due to hardware RAID failure, software RAID failure, human error and application error, being able to recite this taxonomy of root causes does not indicate competency in recovering the data from the damaged RAID set.
- Look for certified RAID recovery service providers. It is usually a good sign if the service provider carries certifications in the basic practices of recovery services. Examples include ISO Class 4 (Class 10) cleanroom certification, which governs particulate contamination levels in the facility if the work is performed at the service provider's shop, and SSAE 16 Type II compliance for secure processing of sensitive data assets, along with other certifications that may pertain to the nature of your data and its legal and regulatory preservation and protection requirements. You should also look for certifications from the hardware vendors upon whose gear or software your RAID system is based.
- Spelled-out contracts are key. Look for a simple-to-understand contract that specifies what services will be provided and what timeframes are expected for results. You may prefer a diagnostic step at the outset to discover whether your data is recoverable at all. Preferably, this will be a free service since the actual recovery may carry a fairly steep price tag -- especially for a SAN or for certain systems combining RAID with content-addressable storage algorithms.
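The inventory described in the first tip can be captured as a simple record before you contact providers. The sketch below is one hypothetical way to do that. The `RaidSetProfile` class, its field names and the placeholder values are all assumptions made for illustration, and the `raid_level` field is an addition not listed in the tip itself.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RaidSetProfile:
    """Hypothetical record of what a recovery provider may ask about."""
    controller_make_model: str = ""
    firmware_version: str = ""
    raid_level: str = ""  # assumed field, not named in the tip
    drive_models: list = field(default_factory=list)
    drive_sizes_gb: list = field(default_factory=list)
    interconnects: list = field(default_factory=list)
    operating_systems: list = field(default_factory=list)
    applications: list = field(default_factory=list)  # treated as optional here

    def missing_fields(self):
        """Return the names of required fields that are still empty."""
        optional = {"applications"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]


# Placeholder values only; substitute the details of your own array.
profile = RaidSetProfile(
    controller_make_model="ExampleVendor X100",
    raid_level="RAID 5",
    drive_sizes_gb=[300, 300, 300, 300],
    interconnects=["SAS"],
)

print(profile.missing_fields())
```

Running the check before the first call shows which details still need to be gathered, here the firmware version, drive models and operating systems.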