This article can also be found in the Premium Editorial Download "Storage magazine: Adding low-cost tiers to conserve storage costs."

Placing value on data

The key to any data archiving plan is to understand the nature of the data, so that when data is moved from one class of storage to another, the level of service will still be acceptable. The data itself will dictate how it may be archived, but some criteria to keep in mind include:

Application. The type of application will bear strongly on how its data may be archived; critical applications, such as CRM, may require maintaining data on higher performing primary storage.
Frequency of use. Generally, older data is infrequently accessed and is less important.
Criticality to operations. Even if the data isn't associated with a key corporate system, its availability may still be critical to certain business applications.
Expiration dates. Some data fades away naturally; other data may need to be retained for certain periods of time.
Regulations. Legal regulations may dictate what data must be retained on storage systems that provide fast and easy access.

Using disk more efficiently
The inexorable growth of corporate data has prompted many storage managers to move data to secondary storage to ease the strain on primary storage systems. The obvious benefit is that the more expensive primary storage is freed up to accommodate the applications requiring that class of storage. Reclaiming disk space that's being used for less-than-critical applications or to hold infrequently accessed data will help delay new disk purchases, or even avoid them altogether.

Nielsen Media Research of New York employs a sophisticated tiering system to allocate much of its installed storage, which totals about 1.2PB. Robert Stevenson, a technology strategist for Nielsen in its Oldsmar, FL, operations facility, says the company has three tiers of storage and is currently developing a fourth. The tiers are defined by price per gigabyte and matched to applications based on the level of service required. "Each of those tiers has different storage price ranges," says Stevenson.

The top tier costs users approximately $20 to $40 per gigabyte; tier two is priced at $15 to $30; and the third tier is $10 to $20 per gigabyte. "The application dictates the type of tier, but there is some blur," says Stevenson, "because tier one and two tend to be pretty similar in terms of throughput" until the number of host servers rises.
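To make the price gap between tiers concrete, the short Python sketch below multiplies the per-gigabyte ranges quoted above by a capacity figure. The function names and the 500GB example are illustrative assumptions, not figures from Nielsen, and comparing the low end of one range with the low end of another (and high with high) is a simplification.

```python
# Price ranges per gigabyte (USD) as quoted in the article for Nielsen's tiers.
TIER_PRICE_PER_GB = {
    1: (20, 40),
    2: (15, 30),
    3: (10, 20),
}


def tier_cost_range(tier, capacity_gb):
    """Return the (low, high) cost in dollars for capacity_gb on the given tier."""
    low, high = TIER_PRICE_PER_GB[tier]
    return low * capacity_gb, high * capacity_gb


def savings_range(from_tier, to_tier, capacity_gb):
    """Return (low, high) savings, comparing like ends of the two price ranges.

    Assumption for illustration: low end is compared with low end and
    high end with high end; real chargeback models may differ.
    """
    from_low, from_high = tier_cost_range(from_tier, capacity_gb)
    to_low, to_high = tier_cost_range(to_tier, capacity_gb)
    return from_low - to_low, from_high - to_high


if __name__ == "__main__":
    # Hypothetical 500GB application moved from tier one to tier three.
    print(tier_cost_range(1, 500))
    print(savings_range(1, 3, 500))
```

Under these assumptions, a business unit can see the cost range of each tier before choosing one for its application.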

Relegating an application to secondary storage can be tricky, sending storage managers down a perilous path fraught with office politics because it requires setting a value on the application and its data. But placing all application data on primary disk poses an even greater risk of paying far more for storage than necessary. Putting the costs up front, as Nielsen is doing, shifts much of the burden of the which-disk decision to the business units, which ultimately should be able to make the best decision based on the company's interests.

Another approach to matching data value to disk cost involves the use of archiving applications. Database archivers such as Princeton Softech's Archive and OuterBay's LiveArchive delve into databases and, using preset policies, identify data that can be moved to secondary storage.
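The general idea of policy-based database archiving can be sketched in a few lines of Python using the standard sqlite3 module. This is not how Princeton Softech's or OuterBay's products work internally; the table names, the closed_on column and the date cutoff are all hypothetical, and both tables live in one in-memory database here, whereas a real archive would sit on secondary storage.

```python
import sqlite3


def archive_old_rows(conn, cutoff_date):
    """Move orders closed before cutoff_date into orders_archive.

    Dates are ISO-8601 strings, so plain string comparison orders them
    correctly. Returns the number of rows removed from the primary table.
    """
    with conn:  # copy and delete are committed together, or rolled back together
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE closed_on < ?",
            (cutoff_date,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE closed_on < ?", (cutoff_date,)
        )
    return cur.rowcount


def demo():
    """Build a tiny hypothetical orders table and archive its oldest rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, closed_on TEXT)")
    conn.execute(
        "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, closed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2001-03-15"), (2, "2003-11-02"), (3, "2004-06-30")],
    )
    moved = archive_old_rows(conn, "2003-01-01")
    remaining = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
    archived = [
        row[0] for row in conn.execute("SELECT id FROM orders_archive ORDER BY id")
    ]
    return moved, remaining, archived
```

The preset policy in this sketch is a single age cutoff; commercial archivers typically also have to preserve relationships between tables, which this example ignores.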

Thinning out the data in databases stored on primary disk yields three main benefits:

  • Primary disk space is freed, providing "growing room" for the application or other applications
  • Purchases of new primary disk can be avoided or delayed
  • The database applications should perform better

The key factor is the assumption that the less frequently used data can be served adequately from lower performing disk. A large Pacific Northwest insurance and investment firm that's using data aging in its tiered environment recently added a Nexsan ATAboy array for secondary storage. "The Nexsan box is used solely as a repository for archived or infrequently accessed files," says Jeff Woodard, a storage architect/designer for the insurance company. The firm is using Arkivio Inc.'s auto-stor, a storage utilization and data management application, to move user files to the Nexsan device. "It's unstructured data; it's pretty much everything you would expect users to be putting in network shares--office files, dot-xls, dot-docs, pdf files."

Woodard adds that Arkivio has been easy to use, and the firm had to do little more than tailor policies by choosing from auto-stor's options. To ensure that the data is protected, no files are moved to secondary storage unless they've been previously backed up. "So far, it's been right on target with what I projected," says Woodard regarding the effectiveness of the archiving project.
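A data-aging policy of the kind described here, combined with the backed-up safeguard, might look something like the following Python sketch. It is not auto-stor's actual policy engine or API; the FileRecord type, the backed_up flag (which a real system would take from the backup catalog) and the 180-day threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_accessed: datetime
    backed_up: bool  # hypothetical flag; a real system would query the backup catalog


def select_for_migration(files, now, max_idle_days=180):
    """Return paths of files idle longer than max_idle_days that are already backed up."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        f.path
        for f in files
        if f.backed_up and f.last_accessed < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2004, 8, 1)
    shares = [
        FileRecord("/shares/budget.xls", datetime(2003, 1, 10), True),
        FileRecord("/shares/memo.doc", datetime(2003, 1, 10), False),
        FileRecord("/shares/report.pdf", datetime(2004, 7, 20), True),
    ]
    print(select_for_migration(shares, now))
```

In this example only the old, backed-up spreadsheet qualifies: the old memo is skipped because it has no backup, and the recent report is skipped because it is still in active use.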

Using data aging as the criterion for use of secondary storage assumes that data loses value over time and, therefore, won't be accessed as often. But old data--even if it's infrequently accessed--may still be a valuable reference resource for business units. (See "Placing value on data.") Data warehouses are good examples of where data aging and access frequency may be poor criteria for migration to secondary storage. Guerin notes that Aetna's data warehousing application is "a highly parallel data warehouse environment where performance is critical." She says the company is not considering secondary storage because "less frequently accessed is hard to determine in those environments."

While the current crop of archiving applications adds standardization and automation to archiving to secondary storage, true automation tools are still to come. In the meantime, a solid understanding of the data that you're working with is the most important ingredient to an effective ILM/secondary storage environment.

This was first published in August 2004
