
Starting the ILM process


This article can also be found in the Premium Editorial Download "Storage magazine: Storage salary survey: Are you being paid enough?."


Data retention requirements


Different industries have different data retention requirements. Users need to make sure that the policies they set up for their information life cycle management (ILM) software retain their data long enough to comply with current requirements. They also need to weigh the pros and cons of keeping data around longer than what is legally required. While mining years- and decades-old data may provide interesting and useful information, it may also expose organizations to unneeded risk if there's no legal requirement to keep it around. Here are some examples of data-retention requirements in different industries:
Health care: The Health Insurance Portability and Accountability Act (HIPAA) regulation requires that health care organizations retain records (electronic and paper) for a minimum of six years. Records must also be retained for two years after a patient's death.
Financial: The Sarbanes-Oxley Act of 2002 mandates that accountants who audit or review financial statements of issuers must retain certain records for a period of five years after the end of the fiscal year in which the audit or review was concluded.

The Securities and Exchange Commission (SEC) requires that financial services firms store all e-mail traffic in its original form for at least three years and that they make those communications "accessible" for the first two years.

Banks and financial institutions in New York now have to keep ATM surveillance tapes for 45 days, instead of 30 days, to comply with New York state's recently strengthened ATM Safety Act.

Government contractors: Government contractors must keep track of their books, documents and accounting practices for one to four years. Regulations vary depending on the size of the company and the type of information. Firms with fewer than 150 employees or with contracts smaller than $150,000 only need to retain records such as employee information for one year, while larger firms need to track items such as paid, canceled and voided checks for up to four years.
Employee records: The Fair Labor Standards Act (FLSA) requires that employee records pertaining to payroll be kept for either two or three years, depending upon what type of payroll and earnings information is in question. The Occupational Safety and Health Administration (OSHA) requires that records about job-related injuries be kept on file for five years. OSHA also requires that records pertaining to medical exams that involve toxic substance and blood-borne pathogen exposure be retained for up to 30 years.
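As a rough illustration of how retention periods like those listed above might be captured in a policy table, here is a minimal Python sketch. The category names, the `clock_start` parameter and the helper functions are hypothetical and are not drawn from any regulation or ILM product; the periods are simply the figures quoted in this sidebar, and the event that starts the clock (record creation, end of a fiscal year and so on) has to be supplied by the caller.

```python
from datetime import date, timedelta

# Minimum retention periods quoted in the sidebar. The keys are hypothetical
# labels chosen for this sketch, not official terms.
MIN_RETENTION = {
    "hipaa_health_record": ("years", 6),
    "sox_audit_record": ("years", 5),    # clock starts at fiscal year end
    "sec_email": ("years", 3),
    "ny_atm_surveillance": ("days", 45),
    "osha_injury_record": ("years", 5),
}


def add_years(start, years):
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def earliest_disposal(category, clock_start):
    """Earliest date on which a record in `category` may be disposed of."""
    unit, amount = MIN_RETENTION[category]
    if unit == "days":
        return clock_start + timedelta(days=amount)
    return add_years(clock_start, amount)


def may_dispose(category, clock_start, today):
    """True once the minimum retention period has elapsed."""
    return today >= earliest_disposal(category, clock_start)
```

A table like this only encodes the legal minimum. Whether to keep data beyond that date is the business decision discussed above.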

Acting on the information
Theresa O'Neil, director of storage strategy at IBM Tivoli, believes that users will look to better manage the data itself by getting rid of their nonessential data and tagging remaining data according to its content. "Organizations contain lots of processes," she says. "While it is important for them to retain data to support these processes, the greater issue becomes: can they retrieve the data needed to support these processes in a specified situation?"

O'Neil also says that while data may remain in existence for years, the media on which it resides on the back end will likely change. In response to this, IBM separates the management of the data and the media. They use their DB2 Content Manager to manage the content of data while their Tivoli Storage Manager product will manage the placement of their data on the media.

Steve Kenniston, a technology analyst with the Enterprise Storage Group, says ILM can be used to initiate conversations with business unit heads about their storage and recovery needs. "Every department head today believes that IT can recover anything anytime and this is just not true," he says, adding that "ILM is all about people, process and technology, not just technology."

Storage admins should look to capitalize on this new pool of knowledge, since they will now, probably for the first time, have hard facts to justify other storage management tools and technologies. These may include automated provisioning, fabric-based virtualization and the deployment of new protocols such as iSCSI or inexpensive storage such as ATA. So, while choosing a storage reporting tool may end up being more of a tactical than a strategic move, it will enable users to gather the data they need to make the more important, longer-term strategic decisions. One of those decisions will be the eventual deployment of an automated data management (ADM) solution.

Choosing the right ADM provider
Two important tasks will enter into the decision-making process when selecting an ADM provider. The first will be trying to understand and document the different types of data retention requirements within your company that this product will be responsible for managing. (See "Data retention requirements.") The other will be picking a product that permits you to set up policies that manage data from its creation to disposition. Products vary both in their ability to set up policies and carry out prescribed actions, such as deleting or migrating data in different environments, and in how effectively they perform these tasks.

These functions become especially relevant in the different user environments that exist today. For users that only need to keep data for hours or days to support applications such as temporary batch files, probably any tool will work. Yet for some financial, government and human resource applications, data retention needs may span years, if not decades. As a result, you need to spend time determining whether a product's policy management abilities are aligned with your environment.
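To make the idea of a creation-to-disposition policy more concrete, the following Python sketch maps a data class and its age to a prescribed action. Everything in it is an assumption made for illustration: the data class names, the age thresholds and the `action_for` function are hypothetical and do not reflect how any vendor's ADM product expresses its policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    min_age_days: int  # rule applies once data is at least this old
    action: str        # "migrate" or "delete"


# Hypothetical policies: short-lived batch files versus long-lived financial data.
POLICIES = {
    "batch_temp": [PolicyRule(2, "delete")],
    "financial": [PolicyRule(90, "migrate"), PolicyRule(5 * 365, "delete")],
}


def action_for(data_class, age_days):
    """Return the action of the oldest rule the data has reached, else "keep"."""
    chosen = "keep"
    for rule in sorted(POLICIES[data_class], key=lambda r: r.min_age_days):
        if age_days >= rule.min_age_days:
            chosen = rule.action
    return chosen
```

With these assumed thresholds, a three-day-old batch file would be deleted, while a financial record of the same age would simply be kept in place. That difference is why policy flexibility matters more as retention periods grow.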

The situation becomes more complicated if you're using different vendors' products or even different products from the same vendor. You will need to ask how the different products hand off data management responsibilities. This becomes especially pertinent when using one vendor's tool to do the day-to-day storage administration and another vendor's tool to do backup and recovery. At some point, the tool doing the day-to-day storage administration will need to hand off the management of the data to the backup product, especially if the data gets deleted and archived.

Here's where companies that offer a suite of products should be better positioned. Some of the major players such as Computer Associates (CA), EMC, Fujitsu Softek, HP, IBM and Veritas now possess many of the tools needed to succeed at ILM; others, such as AppIQ, CommVault Systems, CreekPath Systems, OuterBay, Princeton Softech and StorageTek, have only some of the needed components. One of the keys for each of these vendors will be how soon and how well they can cleanly integrate the different tools they own and show value to their users.

Another issue that may inhibit the deployment of ADM stems from the complexity of today's storage networks. EMC's Lewis notes that while ILM may be done now on an application-by-application basis, it still revolves primarily around manual processes and remains far too complicated for the enterprise. He says that technologies such as fabric-based virtualization, which EMC classifies as a data delivery service, will enable ADM to succeed in enterprise environments because it helps to solve some of the complexity of large storage networks.

Todd Rief, StorageTek's director of corporate strategy, calls fabric-based virtualization a "huge technology for ILM."

Fabric-based virtualization enables the creation of a fabric-based volume table of contents (VTOC). These VTOCs will play an important role in the future management of enterprise storage, emerging as a type of network-based storage routing table and functioning in much the same way as Cisco's routing tables, which manage today's data networks. Also, because all servers can theoretically get their storage from this virtualization layer, it creates a new programming layer in the network that will enable the development and deployment of tools such as ASM without costing a fortune.

Will HSM emerge?
One component of ILM, hierarchical storage management (HSM), involves putting the right data on the right level of storage and removing it as it ages. This technology, long available in mainframe OS/390 environments but considered by some to be a failure in the open systems environment, remains a difficult technology to implement. But through the combination of the aforementioned technologies, HSM may start to become a reality for the enterprise.
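The placement logic at the heart of HSM can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names, the age boundaries, the expiry threshold and the `tier_for` and `plan_moves` functions are all hypothetical, and a real HSM product would weigh far more than days since last access.

```python
from bisect import bisect_right

# Hypothetical tiers, ordered from fastest to slowest. Each boundary is the age
# in days at which data leaves the tier to its left.
TIER_NAMES = ["fibre_channel_disk", "ata_disk", "tape"]
AGE_BOUNDARIES_DAYS = [30, 365]
EXPIRE_AFTER_DAYS = 7 * 365  # assumed age at which data leaves the hierarchy


def tier_for(days_since_access):
    """Return the tier data of this age belongs on, or None if it has aged out."""
    if days_since_access >= EXPIRE_AFTER_DAYS:
        return None
    return TIER_NAMES[bisect_right(AGE_BOUNDARIES_DAYS, days_since_access)]


def plan_moves(files):
    """Map each misplaced file to its target tier.

    `files` maps a file name to (current_tier, days_since_access).
    A target of None means the file is due for removal.
    """
    moves = {}
    for name, (current, age) in files.items():
        target = tier_for(age)
        if target != current:
            moves[name] = target
    return moves
```

Age is the easy part to automate. The harder question, taken up next, is whether the data is worth keeping at all.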

CommVault Systems' Van Wagoner says the crux of the problem with HSM is establishing the value of the data. For ILM to work, the software must be able to look at something as simple as a Word document and determine whether that document contains your kid's soccer schedule or a vital business contract. One file you may want to delete and the other you may want to archive.

While technologies such as fabric-based virtualization and ADM are still maturing, the generally available storage reporting and SAN infrastructure tools look ready for prime time. So now is the time for organizations to start to deploy these first-line tools and experiment with some of the next-generation technologies. As the momentum for ILM continues to build, users would be well-advised to start the process of ILM by putting the initial building blocks in place.

This was first published in December 2003
