Content-addressed storage (CAS) is a method of providing fast access to fixed content (data that is not expected to be updated) by assigning it a permanent place on disk. CAS makes retrieval straightforward by storing each object in such a way that it cannot be duplicated or modified once it has been stored; thus, its location is unambiguous. The term was coined by EMC Corporation when it released its Centera product in 2002.
When an object is stored in CAS, it is given a name that both uniquely identifies it and specifies its storage location. This type of address is called a "content address." It eliminates the need for a centralized index, so it is not necessary to track the location of stored data separately. Once an object has been stored, it cannot be deleted until the specified retention period has expired. In CAS, data is stored on disk rather than on tape, which streamlines the process of searching for stored objects. A backup copy of every object is stored to enhance reliability and to minimize the risk of catastrophic data loss. In the event of a hardware failure, the system administrator is notified by a remote monitoring system.
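As a rough illustration of how a content address can double as a storage location, the sketch below derives an object's name from a SHA-256 digest of its bytes and uses that digest as the file path. The choice of SHA-256, the two-level directory layout, and the names `CasStore`, `put`, and `get` are assumptions made for illustration; the text does not say how Centera or any other product computes its addresses, and retention periods, backup copies, and monitoring are left out.

```python
import hashlib
import tempfile
from pathlib import Path


class CasStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path_for(self, address: str) -> Path:
        # The address alone determines the location, so no separate
        # index of "name -> location" has to be maintained.
        return self.root / address[:2] / address[2:]

    def put(self, data: bytes) -> str:
        # Assumption: the content address is the SHA-256 hex digest.
        address = hashlib.sha256(data).hexdigest()
        path = self._path_for(address)
        if not path.exists():
            # Identical content maps to the same path and is written once.
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_bytes(data)
        return address

    def get(self, address: str) -> bytes:
        return self._path_for(address).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = CasStore(tmp)
        addr = store.put(b"quarterly report, final")
        # Retrieval needs nothing but the address returned by put().
        assert store.get(addr) == b"quarterly report, final"
        print(addr)
```

Because the path is computed from the digest, storing the same bytes a second time returns the same address and writes nothing new.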
A significant advantage of CAS is that it minimizes the storage space consumed by data backups and archives, preventing what some engineers call a "data tsunami" (the overwhelming buildup of information, much of which is obsolete, redundant, or unnecessary). Another advantage is authentication: because there is only one copy of an object, verifying its legitimacy is a simple process.
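Both advantages can be sketched in a few lines, under the assumption that the content address is a cryptographic digest of the object's bytes (the text does not state how addresses are formed). The helper names `content_address`, `verify`, and `deduplicate` are hypothetical, and an in-memory dictionary stands in for the disk.

```python
import hashlib


def content_address(data: bytes) -> str:
    # Assumption: the address is the SHA-256 hex digest of the content.
    return hashlib.sha256(data).hexdigest()


def verify(address: str, data: bytes) -> bool:
    # Authentication reduces to re-hashing the bytes and comparing
    # the result with the address under which they were stored.
    return content_address(data) == address


def deduplicate(blobs):
    # Objects with identical content share one address, so repeated
    # backups of the same data occupy a single entry.
    archive = {}
    for blob in blobs:
        archive[content_address(blob)] = blob
    return archive


if __name__ == "__main__":
    archive = deduplicate([b"memo v1", b"memo v1", b"memo v2"])
    print(len(archive))  # two distinct objects are kept

    addr = content_address(b"memo v1")
    print(verify(addr, b"memo v1"))   # unchanged content matches its address
    print(verify(addr, b"memo v1!"))  # altered content does not
```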