There is an ongoing debate in the data storage industry about RAID and when it makes sense to replace it with erasure coding. Remember, RAID is actually a form of virtualization: it creates virtual storage volumes from a combination of physical disks connected by physical hardware and cabling, and operates them under a common management and protection scheme.
Erasure coding, often viewed as a RAID replacement, is simply another mechanism for spreading data over an assembly of disks. In some implementations, erasure coding systems forgo a common physical controller and use a clustered or distributed management scheme to write data to the disk drives. They also use a special coding methodology to deliver resiliency to the data.
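To make the "special coding methodology" less abstract, here is a minimal sketch of the simplest possible erasure code: a single XOR parity fragment, the same idea parity RAID uses. It is an illustration only. Production erasure coding systems generally use stronger codes (Reed-Solomon, for example) that survive multiple lost fragments, and the function name and sample data below are invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data fragments, as if each were written to a different disk or node.
# The contents are arbitrary sample bytes of equal length.
data = [b"RAID", b"and ", b"code"]

# One extra parity fragment is computed from all data fragments.
parity = xor_blocks(data)

# Simulate losing fragment 1: XOR the survivors with the parity fragment
# to rebuild the missing bytes.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)

print(recovered == data[1])
```

With one parity fragment, any single lost fragment can be rebuilt from the others; tolerating two or more simultaneous losses requires additional coded fragments and a more capable code.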
If these are viewed as two ends of a spectrum -- centrally managed and protected virtual volumes versus distributed, self-healing disk collections -- then there are many systems that fall in between from an architectural standpoint. These in-between options, which can enhance RAID, are rarely discussed in RAID versus erasure coding debates.
RAID is delivered via software hosted either on an application server or on an array controller (which, these days, is essentially a server). The main problem with the current generation of RAID systems is that disk capacities have grown exponentially, which lengthens rebuild times when a disk in the RAID set fails. Aggravating the situation, drives fail more frequently than vendors previously admitted: in the time it takes to rebuild the RAID volume, a second or third drive may well fail, rendering all data on the volume unreadable.
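The relationship between capacity, rebuild time and exposure to a second failure can be sketched with some back-of-the-envelope arithmetic. Everything numeric here is an assumption chosen for illustration, not a measurement: a sustained rebuild rate of 100 MB/s, a 3% annualized failure rate, seven surviving drives, and independent failures at a constant rate. Real rebuilds compete with production I/O, and real failures are often correlated, so treat the results as a best case.

```python
import math

HOURS_PER_YEAR = 8760

def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Best-case hours to rewrite one whole drive at a sustained rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    return capacity_mb / rebuild_mb_per_s / 3600

def second_failure_probability(surviving_drives: int, afr: float, hours: float) -> float:
    """Chance that at least one surviving drive fails within the window.

    Assumes independent drives with a constant failure rate, approximated
    by the annualized failure rate (afr) spread evenly over the year.
    """
    failures_per_hour = afr / HOURS_PER_YEAR
    return 1 - math.exp(-surviving_drives * failures_per_hour * hours)

# Hypothetical drive sizes; only the trend matters, not the exact figures.
for tb in (1, 4, 16):
    hours = rebuild_hours(tb, rebuild_mb_per_s=100)
    risk = second_failure_probability(surviving_drives=7, afr=0.03, hours=hours)
    print(f"{tb:>2} TB drive: ~{hours:.1f} h rebuild, second-failure risk ~{risk:.4%}")
```

Under these assumptions, rebuild time scales linearly with drive capacity, and the window in which another failure can take down the volume grows with it.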
How to use storage virtualization to enhance RAID
This vulnerability could be addressed by nesting RAID under another virtualization scheme that delivers, essentially, an array of arrays. IBM's SAN Volume Controller, as well as storage hypervisor products from numerous vendors, can accomplish this task and enable entire RAID arrays to be treated simply as discrete disks in a larger RAID scheme.
Alternatively, storage virtualization software can be used to create a virtual pool of capacity to which data can be stored, and where it can be exposed to numerous data protection techniques. Data stored to a DataCore Software SANsymphony-V storage pool, for example, can be protected with any or all of an assortment of data protection functions ranging from journaling (continuous data protection) to snapshotting to synchronous or asynchronous replication.
Hardware vendors are also seeking to improve the resiliency of storage, with X-IO the pioneer in this space. That vendor's Intelligent Storage Element array writes data to disk differently, using a unique drive-addressing scheme with a redundancy process built into data placement. Furthermore, the company's pedigree -- it was formerly Seagate's Advanced Storage Architecture division -- gives it the inside track on Seagate's drive testing and remediation technologies, which allow drives exhibiting questionable status to be remediated in place or taken out of use within the array very efficiently.
The bottom line is that the vulnerability of traditional RAID should not necessarily be viewed as an urgent motivation for change. There are many ways to bolster or enhance RAID systems and improve the protection they afford to data before making a disruptive change to object storage systems with erasure coding.