The drive to better align data storage policies to business needs is spurring the development of new storage management tools. Two goals are improved efficiency of assets through better utilization and greater effectiveness of people through automation. These goals converge in a renewed effort underway by several vendors to reintroduce a technology that's been around for many years--hierarchical storage management (HSM).
This time-honored mainframe technology (see "When disk wasn't cheap") never caught on in the Windows and Unix worlds. But changes in technology suggest you should take another look at HSM. And if you're thinking of implementing HSM, consider how to integrate HSM into your open systems environments.
A solution in search of a problem?
Several factors have combined to limit the adoption of HSM in open systems, notably:
- Dramatically falling prices for disk storage
- Distributed nature of open systems environments
- Fundamental characteristics of open system apps
As disk prices fell, it became simpler to buy more disk than to manage what was already installed. This practice resulted in an increase in overall capacity, while utilization rates fell. Although this was wasteful, as long as the emphasis was on the cost of acquisition rather than the cost of management, it seemed reasonable. However, as administrative costs have begun to outstrip hardware costs, IT managers have renewed their interest in any option that can mitigate these costs.
The decentralized nature of open systems also discouraged the adoption of HSM. In the days of non-centralized, direct-attached storage (DAS), the opportunity to reallocate excess storage didn't exist. It wasn't worth the effort to recoup space from a particular system because there wasn't a way to effectively reassign it. With the onset of storage networks, reallocating storage has become more practical, so this objection no longer applies in many environments.
Nor does the issue of open systems' more interactive character (compared with the mainframe's) carry quite the same weight, though it still deserves attention. Open-systems applications tend to be highly interactive, whereas mainframes, to a large extent, perform huge quantities of batch processing, and interactive applications don't tolerate recall delays well. If you directly applied the mainframe approach to open systems, users attempting to access a migrated document, for example, would be faced with an hourglass icon for several minutes. The potential increase in help desk calls alone is enough to discourage adoption of HSM. The problem is particularly apparent when tape is the target medium for HSM data.
HSM in today's environment
So, is there a place for HSM? The answer is a qualified yes. Some reasons to consider HSM today are:
- To reduce costs of storage management
- To improve backup/restore performance
- To improve management of large e-mail repositories and other databases
HSM also dramatically improves backup operations. In traditional backup environments, full backups are performed on a regular basis. Studies have shown that a large percentage of files on file servers are rarely accessed after a few months, yet these files continue to impact the time required to perform backups and the amount of media consumed.
With HSM, these files would be migrated to tape with stubs--or fingerprints--left on primary storage, greatly reducing the size of the primary data stores. They would no longer be constantly backed up, thereby improving backup and recovery times and reducing tape consumption.
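The stub mechanism described above can be illustrated with a toy sketch. It is not how any particular HSM product works: real products typically intercept file access at the file system level, whereas this example simply moves a file into an archive directory (standing in for tape or another secondary tier) and leaves a small JSON stub behind. The `.hsmstub` suffix, the stub fields and the function names are all assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path

STUB_SUFFIX = ".hsmstub"  # hypothetical marker, not a real product convention


def migrate(path: Path, archive_dir: Path) -> Path:
    """Move a file to the archive tier and leave a small stub on primary storage."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    # A real system would need unique archive keys; the bare file name is a simplification.
    target = archive_dir / path.name
    size = path.stat().st_size
    shutil.move(str(path), str(target))
    stub = path.with_name(path.name + STUB_SUFFIX)
    stub.write_text(json.dumps({"archived_at": str(target), "size": size}))
    return stub


def recall(stub: Path) -> Path:
    """Bring the archived file back to its original location and remove the stub."""
    info = json.loads(stub.read_text())
    original = stub.with_name(stub.name[: -len(STUB_SUFFIX)])
    shutil.move(info["archived_at"], str(original))
    stub.unlink()
    return original
```

Because the stub holds only a pointer and the original size, a backup of primary storage would copy a few dozen bytes per migrated file instead of the full file. The delay a user experiences corresponds to the `recall` step, which is fast here but could take minutes if the archive tier were tape.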
Similarly, a problem plaguing many environments today is the growth of e-mail and databases. Several vendors offer HSM-related products specifically designed for use with applications such as Microsoft Exchange or Oracle that enable the migration of old messages, attachments and infrequently accessed records to other media (see "Application-focused HSM/HSM-related products"). The promised result is a reduction in the size of the primary repositories.
Integrating an HSM solution into a storage management framework shouldn't be approached without evaluating the impact on the rest of the organization. You should consider four main questions:
How well do you know your data? There needs to be a solid understanding of the data being managed in order to establish appropriate policies that correctly align with the value of the data at risk. Simply determining that a file hasn't been accessed for a certain period of time isn't sufficient to make it a candidate for migration. A clearly defined data classification methodology with broad support within the organization is one requirement for a successful HSM implementation.
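As a rough illustration of why access age alone is not enough, the following sketch combines idle time with a data classification label before marking a file for migration. The class names, thresholds and the `FileInfo` structure are hypothetical; an actual policy would come out of the organization's own classification work.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileInfo:
    path: str
    days_since_access: int
    data_class: str  # label assigned by the organization's classification scheme


# Hypothetical policy: minimum idle days before migration, per data class.
# None means data of that class is never migrated.
POLICY: Dict[str, Optional[int]] = {
    "project-archive": 90,
    "general": 180,
    "regulated": None,
}


def is_migration_candidate(f: FileInfo, policy: Dict[str, Optional[int]] = POLICY) -> bool:
    """A file qualifies only if its class allows migration and it has been idle long enough."""
    if f.data_class not in policy:
        return False  # unclassified data stays on primary storage
    threshold = policy[f.data_class]
    return threshold is not None and f.days_since_access >= threshold
```

Under this example policy, a "regulated" file stays on primary storage no matter how long it has been idle, and unclassified files are never migrated by default.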
What's the impact on users and applications? The impact of delays in accessing data needs to be understood before deploying HSM. Are delays acceptable? Can they be mitigated with near-line storage? Can your applications handle them appropriately?
How does HSM impact backup and other storage operations? It's important to understand what operational changes will be required to accommodate HSM. Also, where will HSM software or agents need to be deployed, and what's the impact?
How can I back out the HSM solution from the environment, if necessary? If HSM turns out to not be the right solution or there's a desire to change vendors, it's important to understand the level of effort and impact to return the environment to its non-HSM state.
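One small part of the back-out question can be checked in advance: whether primary storage could hold everything that has been migrated. The sketch below assumes you can obtain the sizes of migrated files and the free space on primary storage from your own tools; the function name and the 10% headroom default are illustrative assumptions, not a recommendation.

```python
from typing import Iterable


def back_out_fits(migrated_sizes: Iterable[int], primary_free: int,
                  headroom: float = 0.10) -> bool:
    """Return True if recalling all migrated data still leaves `headroom` of the free space unused."""
    needed = sum(migrated_sizes)
    usable = primary_free * (1.0 - headroom)
    return needed <= usable
```

A result of `False` suggests that returning to a non-HSM state would require adding primary capacity first, which is part of the level of effort to account for.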
With the evolution of storage networks, low-cost disk, and enhanced software offerings, HSM is worth another look. Application-focused HSM solutions, in particular, have the potential to provide some unique benefits. The success of these solutions depends on a clear understanding of requirements, benefits and risks.