In the mainframe's heyday, disk was expensive, prompting systems vendors to use hashing algorithms to trim down their data stores. By transforming a string of characters into a shorter fixed-length value that represents the original string, hashing can ensure that a given character string is stored only once.
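To make the idea concrete, here is a minimal sketch of a store that keeps each distinct string only once by filing it under its hash. It is an illustration only, not any vendor's implementation: the `DedupStore` class and its method names are hypothetical, and the choice of SHA-256 is an assumption, since the hash functions used by the products mentioned here are not specified.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: each distinct string is kept once.

    Hypothetical illustration; SHA-256 is an assumed hash function.
    """

    def __init__(self):
        self._blobs = {}  # hash digest -> original bytes (stored once)
        self._names = {}  # caller-supplied name -> hash digest

    def put(self, name, text):
        """Store text under name and return its fixed-length digest."""
        data = text.encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        # Only keep the bytes if this digest has not been seen before.
        self._blobs.setdefault(digest, data)
        self._names[name] = digest
        return digest

    def get(self, name):
        """Return the original text stored under name."""
        return self._blobs[self._names[name]].decode("utf-8")

    def unique_count(self):
        """Number of distinct strings actually held."""
        return len(self._blobs)


store = DedupStore()
first = store.put("report-a", "quarterly results " * 1000)
second = store.put("report-b", "quarterly results " * 1000)
print(first == second)       # identical content yields the same digest
print(store.unique_count())  # two names, but only one stored copy
```

The digest has the same length whatever the size of the input, so the saving comes from long or frequently repeated strings. A sketch like this also treats matching digests as proof of matching content; a production system would have to decide how to handle the remote possibility of two different strings producing the same hash.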
These days, storage is cheap, but data is plentiful, so storage vendors have once again turned to hashing to keep data capacities under control.
The best-known example of this trend is EMC's archive solution, Centera, but several innovative startups have also resurrected the hash.
Avamar uses a hash function to reduce the amount of data it stores in its Axion backup and recovery arrays, while Permabit uses hashing as the foundation of a software-based compliance repository.
But Marc Duvoisin, national director of enterprise servers and storage for Dimension Data, in Reston, VA, thinks hashing's real promise lies in remote office consolidation. "Networking has gotten cheaper, but not that cheap," he says. And as yet, no one has solved the problem of cost-effectively transferring data between central and remote offices.