In searching for RAID alternatives, today's data storage pros will find that familiar technologies such as erasure coding are back in vogue.
Before RAID, some storage arrays already provided drive-level data protection using a set of technologies collectively referred to as erasure coding. Without delving too deeply into the math, the main idea of erasure coding is to enable the reconstruction of data that becomes corrupted at some point in the disk storage process by using information about the data stored elsewhere in the array. Erasure codes provided redundancy without requiring a complete replica of the data itself, a strategy that offered both data protection and cost savings over simple mirroring schemes.
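The reconstruction idea described above can be illustrated with the simplest possible erasure code: a single XOR parity block, the same principle behind RAID 5 parity. The following Python sketch is only an illustration under that assumption, not how any particular array or vendor implements it; the function names (xor_blocks, encode, recover) are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def recover(stripe, lost_index):
    """Rebuild the block at lost_index by XORing every surviving block."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)

# Three data blocks, as if each lived on its own drive; the parity goes on a fourth.
data = [b"ABCD", b"EFGH", b"IJKL"]
stripe = encode(data)

# Pretend the second drive failed and rebuild its block from the other three.
rebuilt = recover(stripe, 1)
```

The parity block is the size of one data block, not a full copy of all three, which is where the savings over mirroring come from. A single parity block survives only one lost block per stripe; tolerating several failures takes more elaborate codes.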
One current technology, BitSpread from Amplidata, provides a fairly straightforward example of how erasure coding could be implemented in enterprise storage (and is already implemented in Amplidata systems and the object cloud services that use them). With BitSpread, a piece of data that is to be recorded to a storage device is converted into an object, which is then subjected to an algorithmic process to create object "fragments." Any two fragments, when multiplied by each other, recreate the original object.
Via this process, Amplidata claims it saves approximately 50% of the capacity that would otherwise be required to fully mirror a file or object to another location on the disk array. The vendor further claims it could use these object fragments, which are spread across 16 drives, to rebuild all data in the drive set -- even if four drives in the set fail.
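To see how this kind of capacity comparison is calculated, the arithmetic can be sketched in Python. The sketch assumes an idealized code in which tolerating four lost fragments out of 16 costs exactly four fragments' worth of redundancy (12 data plus 4 redundant); the article does not give Amplidata's actual parameters, so the figures below are illustrative only and do not attempt to reproduce the vendor's own number. The function name raw_per_usable is hypothetical.

```python
from fractions import Fraction

def raw_per_usable(total_fragments, tolerated_failures):
    """Raw capacity consumed per unit of usable data, assuming each
    tolerated failure costs exactly one fragment of redundancy."""
    data_fragments = total_fragments - tolerated_failures
    return Fraction(total_fragments, data_fragments)

# Assumed layout: 16 fragments, any 4 of which may be lost.
erasure_coded = raw_per_usable(16, 4)

# Two-way mirroring stores every byte twice.
mirrored = Fraction(2, 1)

# Fraction of raw capacity saved relative to mirroring under these assumptions.
saving = 1 - erasure_coded / mirrored
```

Changing the two arguments shows how sensitive the result is to the chosen layout, which is why vendor savings figures should be read alongside the configuration they assume.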
Advocates claim that object storage technologies that use such erasure coding techniques, from Amplidata, Cleversafe and a few others, will eventually replace RAID and usher in a new generation of low-cost, high-capacity object storage. However, many people overlook that the parity schemes built into some RAID levels (such as RAID 5 and 6) also fall under a generic usage of the term erasure coding.
The key feature of erasure coding storage systems is that they deliver resiliency with less bandwidth and storage than conventional RAID arrays. (Erasure coding storage is described as distributed and self-healing, rather than centrally managed and protected, but it provides protection similar to that of RAID systems.)
Few storage planners will implement erasure coding themselves, but they may deploy early products from vendors embracing the technology, such as DataDirect Networks, EMC and a few others. Of course, many cloud storage providers are beginning to use RAID alternatives to build massive storage capacity with good resiliency at a lower cost than traditional RAID.
The gating factor on adoption of the technology continues to be its large education requirement. Customers aren't yet conversant in the rarefied domain of object storage, which partially accounts for why Dell dropped out of the chase to deliver object storage solutions in 2013 and why NetApp and IBM have done little to pursue the technology in their products. Still, erasure coding remains a hot button in the analyst realm and a centerpiece of emerging storage products from lesser-known storage vendors.