This article can also be found in the Premium Editorial Download "Storage magazine: Adding low-cost tiers to conserve storage costs."
Placing value on data
Using disk more efficiently
The inexorable growth of corporate data has prompted many storage managers to move data to secondary storage to ease the strain on primary storage systems. The obvious benefit is that the more expensive primary storage is freed up to accommodate the applications requiring that class of storage. Reclaiming disk space that's being used for less-than-critical applications or to hold infrequently accessed data will help delay new disk purchases, or even avoid them altogether.
Nielsen Media Research of New York employs a sophisticated tiering system to allocate much of its installed storage, which totals about 1.2PB. Robert Stevenson, a technology strategist at Nielsen's Oldsmar, FL, operations facility, says the company has three tiers of storage and is developing a fourth. The tiers are defined by price per gigabyte and matched to applications based on the level of service required. "Each of those tiers has different storage price ranges," says Stevenson.
The top tier costs users approximately $20 to $40 per gigabyte; tier two is priced at $15 to $30; and the third tier is $10 to $20 per gigabyte. "The application dictates the type of tier, but there is some blur," says Stevenson, "because tier one and two tend to be pretty similar in terms of throughput" until the number of host servers rises.
Relegating an application to secondary storage can be tricky, sending storage managers down a perilous path fraught with office politics because it requires setting a value on the application and its data. But placing all application data on primary disk poses an even greater risk: paying far more for storage than necessary. Putting the costs up front, as Nielsen is doing, shifts much of the burden of the which-disk decision to the business units, which ultimately should be able to make the best decision based on the company's interests.
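To see how up-front pricing frames that decision, the quoted price ranges can be turned into a simple cost comparison. The following is a minimal sketch, not Nielsen's actual chargeback method: the function name and the 500GB request are hypothetical, and only the per-gigabyte ranges come from the figures above.

```python
# Price ranges in dollars per gigabyte, as quoted for the three tiers.
TIER_PRICE_PER_GB = {
    1: (20, 40),
    2: (15, 30),
    3: (10, 20),
}

def cost_range(tier, capacity_gb):
    """Return the (low, high) dollar cost of capacity_gb on the given tier."""
    low, high = TIER_PRICE_PER_GB[tier]
    return capacity_gb * low, capacity_gb * high

# Hypothetical request: a business unit asks for 500GB.
for tier in sorted(TIER_PRICE_PER_GB):
    low, high = cost_range(tier, 500)
    print(f"Tier {tier}: ${low:,} to ${high:,}")
```

For a 500GB request, tier one works out to $10,000 to $20,000 and tier three to $5,000 to $10,000, which is the kind of gap a business unit can weigh against the service level it actually needs.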
Another approach to matching data value to disk cost involves the use of archiving applications. Database archivers such as Princeton Softech's Archive and OuterBay's LiveArchive delve into databases and, using preset policies, identify data that can be moved to secondary storage.
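The general idea of a policy-driven database archive can be illustrated with a small sketch. This is not how Archive or LiveArchive work internally; it assumes a hypothetical `orders` table, a simple date cutoff as the policy, and an archive table in the same SQLite database standing in for secondary storage.

```python
import sqlite3

def archive_old_rows(conn, cutoff_date):
    """Move orders dated before cutoff_date into orders_archive; return the count moved."""
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO orders_archive SELECT * FROM orders WHERE order_date < ?",
            (cutoff_date,),
        )
        cur = conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff_date,))
    return cur.rowcount

# Hypothetical sample data; ISO-format date strings compare correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)")
conn.execute("CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2001-03-15", 120.0), (2, "2003-11-02", 75.5), (3, "2004-06-30", 310.0)],
)
conn.commit()

moved = archive_old_rows(conn, "2003-01-01")
print(f"Rows moved to archive: {moved}")
```

Copying and deleting inside a single transaction means a row is never lost between the two tables if the operation fails partway through.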
Thinning out the data in databases stored on primary disk yields three main benefits:
- Primary disk space is freed, providing "growing room" for the application or other applications
- Purchases of new primary disk can be avoided or delayed
- The database applications should perform better
Woodard adds that Arkivio has been easy to use, requiring little more than tailoring policies by choosing from auto-stor's options. To ensure that the data is protected, no files are moved to secondary storage unless they've previously been backed up. "So far, it's been right on target with what I projected," says Woodard regarding the effectiveness of the archiving project.
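The backup safeguard can be pictured as one extra condition in a migration policy. The sketch below is illustrative only and does not reflect auto-stor's actual options or interface; the `FileRecord` type, the 180-day idle threshold, and the sample files are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class FileRecord:
    path: str
    last_accessed: date
    backed_up: bool

def eligible_for_migration(f, today, max_idle_days=180):
    """A file qualifies only if it has been idle long enough AND is already backed up."""
    idle = today - f.last_accessed
    return f.backed_up and idle > timedelta(days=max_idle_days)

# Hypothetical file inventory.
today = date(2004, 8, 1)
files = [
    FileRecord("/data/q1_report.doc", date(2003, 4, 1), True),   # idle and backed up
    FileRecord("/data/old_scan.tif", date(2002, 9, 9), False),   # idle but not backed up
    FileRecord("/data/budget.xls", date(2004, 7, 20), True),     # recently accessed
]

to_move = [f.path for f in files if eligible_for_migration(f, today)]
print(to_move)
```

Only the first file passes both checks; the second stays on primary disk until a backup exists, however long it has sat untouched.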
Using data age as the criterion for moving data to secondary storage assumes that data loses value over time and, therefore, won't be accessed as often. But old data, even if it's infrequently accessed, may still be a valuable reference resource for business units. (See "Placing value on data.") Data warehouses are good examples of where data age and access frequency may be poor criteria for migration to secondary storage. Guerin notes that Aetna's data warehousing application is "a highly parallel data warehouse environment where performance is critical." She says the company is not considering secondary storage because "less frequently accessed is hard to determine in those environments."
While the current crop of archiving applications brings standardization and automation to secondary-storage archiving, true automation tools are still to come. In the meantime, a solid understanding of the data you're working with is the most important ingredient in an effective information lifecycle management (ILM) and secondary storage environment.
This was first published in August 2004