Storage Technology Magazine

Developing a strategy for data archiving

One of the most challenging tasks facing storage managers is developing a strategy for archiving data. Deciding what should be archived, when it should be archived and for how long goes to the core of the storage management process. You also have to understand the business value of data - perhaps more than you currently do. Done properly, archiving can be a lifesaver for businesses that need access to historical information for regulatory or audit purposes. Done badly, it can cost a company dearly in lost revenue, fines and other penalties.

To avoid these problems, you need a comprehensive strategy built around solid data retention policies - something you'll need to develop with business managers. But a host of factors is also directly under the control of storage managers: the tools you use, the formats you choose and the procedures you follow to execute your strategy.


Tools that help you archive at the database level

PRODUCT: Archive for Servers
VENDOR: Princeton Softech, Princeton, NJ
SUPPORTED DATABASES: Oracle, DB2/UDB, SQL Server, Sybase, Informix
COMMENTS: Wide range of platforms supported; maintains referential integrity of archived data for quick restore

PRODUCT: NetBackup Database Archiver
VENDOR: Veritas, Mountain View, CA
SUPPORTED DATABASES: Oracle
COMMENTS: Stores records in XML format for easy retrieval in the future; integrates with Veritas NetBackup DataCenter
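
To illustrate the idea of storing archived database records in XML, here is a minimal sketch. It is not how NetBackup Database Archiver or any other product listed here works internally; it only shows, using Python's built-in sqlite3 and xml.etree.ElementTree modules, how rows can be written out in a self-describing text format. The orders table, its columns and the cutoff date are invented for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table_to_xml(conn, table, where_sql, params=()):
    """Return an XML string holding every row of `table` matched by `where_sql`."""
    # The table name and WHERE clause are interpolated into the SQL text,
    # so only pass trusted values here; row values go through `params`.
    cursor = conn.execute(f"SELECT * FROM {table} WHERE {where_sql}", params)
    columns = [description[0] for description in cursor.description]

    root = ET.Element("archive", {"table": table})
    for row in cursor:
        record = ET.SubElement(root, "record")
        for name, value in zip(columns, row):
            field = ET.SubElement(record, "field", {"name": name})
            if value is None:
                # Mark NULL explicitly so it differs from an empty string.
                field.set("null", "true")
            else:
                field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def demo():
    # Hypothetical sample data: two orders, one closed before the cutoff.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, closed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Acme", "1999-03-31"), (2, "Globex", "2002-11-15")],
    )
    return export_table_to_xml(conn, "orders", "closed_on < ?", ("2000-01-01",))


if __name__ == "__main__":
    print(demo())
```

The sketch only copies rows out; it does not delete anything from the source database, and it makes no attempt to preserve relationships between tables, which is the hard part that commercial tools address.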

Archiving vs. backup

When many administrators hear the term "archive," they think backup. That's where the trouble often begins.

"Sure, we do archiving," I was recently told by an IT manager. "Every quarter, we send full backups off for seven years," he stated confidently.

I asked him a few follow-up questions: How would he handle specific requests for three- or four-year-old data? What would the process be for retrieving it? This quickly left him feeling somewhat less confident.

One reason the term "archive" is often misused is that products claiming to do archiving provide very different capabilities. At one end of the spectrum, a number of backup products treat archiving as simply a backup followed by deletion of the data from primary storage - a rather scary thought. This definition of archiving is really intended to help remove old data that is cluttering up servers. A more effective way to address that particular problem is with storage resource management (SRM) or hierarchical storage management (HSM) tools.

So what is archiving, anyway?

A more useful definition of archiving is "the long-term storage of a point-in-time copy of information for a specific business purpose." Backup, by contrast, is intended primarily to protect against short-term data loss, such as accidental deletion, device failure and data corruption.
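
One way to make that definition concrete is to record its three elements (a point in time, a business purpose and a retention period) alongside every archived copy. The following is a minimal sketch, not a standard schema: the ArchiveRecord class, its field names and the seven-year figure in the usage note are all assumptions made for illustration, and real retention periods must come from your business and compliance managers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical metadata kept with each archived copy."""
    name: str
    captured_on: date       # the point in time the copy represents
    business_purpose: str   # why the copy is being kept
    retain_years: int       # set by business policy, not by storage alone

    def retain_until(self) -> date:
        target_year = self.captured_on.year + self.retain_years
        try:
            return self.captured_on.replace(year=target_year)
        except ValueError:
            # captured_on was Feb. 29 and the target year is not a leap year
            return self.captured_on.replace(year=target_year, day=28)

    def may_dispose(self, today: date) -> bool:
        """True only once the retention period has fully elapsed."""
        return today > self.retain_until()
```

For example, ArchiveRecord("FY2002 Q4 general ledger", date(2002, 12, 31), "financial audit", 7) reports a retain_until() of Dec. 31, 2009. A backup catalog typically has no equivalent of the business_purpose field, which is a quick way to tell the two apart.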

Strong candidates for archiving include periodic corporate financial information retained for auditing purposes, medical patient information retained for compliance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA), and data from clinical trials of a new drug wending its way through the FDA drug approval process.

The long-term nature of archived data presents a number of problems. Some may seem obvious, while others are less so. Here are some fundamental concerns:

  • Can the media format be read? How many of you still have QIC tape drives in-house? How about 9-track tape? Today, we have various tape formats and numerous generational variants within a given format. Tape drives typically can't read media older than a generation or two. For long-term retention, some thought must be given to maintaining devices for long-term recovery, or migrating data to newer media. This is further complicated in some regulated industries, where migration can raise validation and authentication issues.
  • Is the media still valid? The lifespan of magnetic tape media is dependent on a number of factors, but the bottom line is that if data is being maintained for a long time, steps must be taken to ensure long-term integrity. This includes maintaining proper environmental control, refreshing volumes as needed and similar tasks.
  • Can the data be used after it's restored? This goes to the heart of the matter. The data must be in a reasonably portable format, not one that depends on a now-obsolete version of an application or operating platform. Old data might depend on the version of the application, the operating system and even the processor architecture in use when the data was stored.
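
The questions about media validity and migration can be partly addressed by recording a checksum for every archived file when it is written, then rechecking those checksums on a schedule and after every move to new media. The sketch below is one simple way to do that with Python's standard hashlib module; the function names and the choice of SHA-256 are assumptions for illustration, and a regulated environment may require a specific, validated procedure instead.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return a list of (relative path, problem) for files that fail the check."""
    root = Path(root)
    problems = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems.append((relative_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((relative_path, "checksum mismatch"))
    return problems


def demo():
    # Hypothetical one-file archive: verify it, alter it, verify again.
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp)
        (archive / "ledger.csv").write_text("id,amount\n1,100\n")
        manifest = build_manifest(archive)
        clean = verify(archive, manifest)
        (archive / "ledger.csv").write_text("id,amount\n1,999\n")
        damaged = verify(archive, manifest)
    return clean, damaged
```

Keep the manifest somewhere other than the media it describes. Running verify against a restored or migrated copy with the original manifest shows that the bits survived, but it says nothing about whether an application still exists that can interpret them.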

This was first published in January 2003