Q: Can you please explain what content-addressed storage (CAS) is, and what it's used for?

A: CAS is a storage system that locates information by the content of the data itself, rather than by a file name or a physical address.

Typically, this is implemented by running a hashing algorithm over the data to produce an ID that is, for practical purposes, unique to that content. That ID, sometimes called a signature, is then associated with metadata recording the actual location of the data. Only the ID is made available to the application, or user if you will, and it is what they present to access that data. One distinctive aspect of this approach is that if the exact same data is written again, it resolves to the same ID, so no duplicate copy is actually stored. CAS also protects stored data from being changed: any change to the data produces a new ID, so the modified version is stored as a new object and the original remains intact under its original ID.
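
To make the mechanics concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular CAS product works: it assumes SHA-256 as the algorithm that turns content into an ID, and it uses an in-memory dictionary in place of the location metadata. The class and method names (ContentAddressedStore, put, get) are hypothetical.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory CAS: an object's ID is the SHA-256 digest of its bytes."""

    def __init__(self):
        # Maps ID -> stored bytes. In a real system this would be metadata
        # pointing at wherever the data physically lives (an assumption here).
        self._objects = {}

    def put(self, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same ID, so it is kept only once.
        self._objects.setdefault(object_id, data)
        return object_id

    def get(self, object_id: str) -> bytes:
        data = self._objects[object_id]
        # Re-hash on read to confirm the content still matches its ID.
        if hashlib.sha256(data).hexdigest() != object_id:
            raise ValueError("stored content does not match its ID")
        return data

    def __len__(self):
        return len(self._objects)


store = ContentAddressedStore()
first_id = store.put(b"quarterly report")
second_id = store.put(b"quarterly report")      # same bytes written again
changed_id = store.put(b"quarterly report v2")  # modified content

print(first_id == second_id)   # True: no duplicate is stored
print(first_id == changed_id)  # False: changed data gets a new ID
print(len(store))              # 2
```

Writing the same bytes twice returns the same ID and leaves one stored object, while the modified bytes get a different ID and sit alongside the original rather than replacing it.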

CAS is really an early form of object-based storage. The concept isn't new. It's interesting to me because my graduate work at the University of Colorado was on content-addressable memory, where similar techniques were used for placement of data in random access memories. That was about thirty years ago.

--Randy Kerns, partner, The Evaluator Group Inc.

This was first published in November 2003

