Feature

Lock up data with fixed-content storage

This article can also be found in the Premium Editorial Download "Storage magazine: What to do when storage capacity keeps growing."

Capacity management
There are three primary ways CAS products manage and reduce the amount of data they store: object-based storage, single-instance storage (SIS) and data deduplication.

CAS vendors that support RAIN and networked storage array architectures store files by saving them as objects. Incoming files are scanned, and a hashing algorithm creates a unique identifier for each file. That identifier is stored in the CAS product's metadata database and is used to reference and access the object in the future. This technique, called SIS, reduces the amount of storage consumed. Whenever a file is submitted to the CAS product for storage, the hashing algorithm generates the same unique identifier for the same content, even if some of the file attributes are different. This lets users save storage space because they're not storing multiple instances of the same file.
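The identifier-based storage described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular CAS product works: the `SingleInstanceStore` class, its method names, the file names and the choice of SHA-256 are all assumptions made for the example, and the "metadata database" is just an in-memory dictionary.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self.objects = {}   # identifier -> file content, kept once per identifier
        self.metadata = {}  # file name -> identifier (stand-in for the metadata database)

    def put(self, name, data):
        # The identifier is derived from the content alone, so the file
        # name and other attributes do not influence it.
        identifier = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption
        is_new = identifier not in self.objects
        if is_new:
            self.objects[identifier] = data
        self.metadata[name] = identifier
        return identifier, is_new

    def get(self, name):
        # Resolve the name to its identifier, then fetch the single stored copy.
        return self.objects[self.metadata[name]]

    def stored_bytes(self):
        return sum(len(data) for data in self.objects.values())


store = SingleInstanceStore()
first, first_is_new = store.put("q2-report.pdf", b"quarterly figures")
second, second_is_new = store.put("copy-of-report.pdf", b"quarterly figures")
third, third_is_new = store.put("q3-report.pdf", b"different figures")
```

In this sketch the second `put` returns the same identifier as the first and stores nothing new, while the third file, whose content differs, gets its own object.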

Before implementing SIS, users need to consider the time it takes the CAS product to generate the unique identifier and check its metadata database to see if that identifier already exists. Searching for a unique identifier may be done quickly during initial deployment, but as the metadata database grows, each search takes longer.

For the fastest file storage and recovery possible, users should use the latest version of the RAIN OS. EMC, for example, claims that under certain conditions the latest version of Centera's CentraStar OS performs four to five times faster than earlier releases. Another option is to upgrade hardware nodes with faster CPUs and 1 Gigabit Ethernet ports rather than the 100Mb ports common to first-generation nodes. Upgrading shouldn't be that painful because RAIN nodes may be taken offline and replaced nondisruptively, and different generations of nodes can operate in the same cluster.

Another factor to consider before turning on SIS is the type of file being archived. For certain types of files, such as check images, nothing will be gained by turning on SIS because each image is typically unique. Conversely, users will see significant savings using SIS when storing e-mail attachments, for example, where the same file is often archived many times.
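To make the effect of file type concrete, the following Python sketch estimates what fraction of bytes SIS would save for a given set of files. The `sis_savings` function and both sample workloads are hypothetical and exist only for this illustration; real savings depend on the actual data being archived.

```python
import hashlib


def sis_savings(files):
    """Return the fraction of bytes saved by keeping each distinct content once."""
    total = sum(len(data) for data in files)
    if total == 0:
        return 0.0
    unique = {}
    for data in files:
        # Identical content hashes to the same key, so it is counted once.
        unique[hashlib.sha256(data).digest()] = len(data)
    return 1 - sum(unique.values()) / total


# Hypothetical workloads: ten distinct check images versus one
# attachment archived from ten different mailboxes.
check_images = [b"check-image-%d" % n for n in range(10)]
attachments = [b"same slide deck"] * 10

check_saving = sis_savings(check_images)
attachment_saving = sis_savings(attachments)
```

With these made-up inputs, the check images yield no savings because every file is distinct, while the repeated attachment saves roughly 90% of the space.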

This was first published in June 2006
