Storage.com

How to create a successful data archiving strategy

By Phil Goodwin

What you will learn: The technical tools and processes required for an effective data archiving strategy depend entirely on a company's compliance, data governance and storage management requirements.

The story goes that someone once asked Abraham Lincoln how long a man's legs should be. Our 16th president reportedly replied, "Long enough to reach the ground." Similarly, when it comes to the question of how long data should be archived, the reply might be, "Long enough to be sure it's available when you need it." That answer captures the two most critical variables of the data archive equation: time and accessibility.

Time, or more accurately the retention period, is the "tip of the spear" when it comes to matching an organization's needs with potential archiving solutions. Data retention requirements can be highly variable, often determined on an application-by-application basis. For example, all organizations must manage financial data, which generally must be retained for seven years. Human resources data may need to be retained for three years, but that regulation can vary by state. Medical data might be retained for the life of the patient plus seven years, nuclear power data for 70 years and so on.

There's a simple answer to the question of what all these time periods have in common: compliance. In most cases, the retention requirement matches the statute of limitations for a party (either governmental or private) to bring legal action against the organization. Failing to produce records demanded by a court order can lead to civil and, in some cases, criminal penalties. On the flip side, retaining records beyond the mandated period makes them subject to legal discovery and needlessly jeopardizes the organization's legal position.

Unfortunately (or perhaps fortunately), most IT people have no legal background. So, step one in developing a data archiving strategy is to inventory the data and assign a retention schedule to it. Corporate counsel may be able to provide the necessary parameters. If the attorneys can't (and you'd be surprised how often they decline to do so), the heads of the individual departments that "own" the data might be able to supply the retention information, as they should be familiar with the regulatory environment of their area. Sometimes, attorneys and department leaders don't want to chisel a time frame in stone. In that case, IT organizations shouldn't guess. In the absence of a specific time frame, the default retention period becomes "forever." While not optimal, it may be the only option for IT managers.
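The inventory-and-schedule step above can be sketched in code. The data classes, year counts and helper below are hypothetical illustrations, not part of any real product; the one behavior taken directly from the article is the default: when no retention period has been supplied, the data is kept forever.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention schedule, in years. None means counsel or the
# owning department supplied no period, so the default is "forever."
RETENTION_YEARS: dict[str, Optional[int]] = {
    "financial": 7,       # common seven-year requirement
    "hr": 3,              # may vary by state
    "engineering": None,  # no schedule supplied -- retain indefinitely
}

def is_expired(data_class: str, created: date, today: date) -> bool:
    """Return True only when a defined retention period has elapsed."""
    years = RETENTION_YEARS.get(data_class)
    if years is None:  # unknown or unscheduled data: keep forever
        return False
    # Approximate a year as 365 days for this sketch.
    return today - created > timedelta(days=365 * years)

print(is_expired("financial", date(2005, 1, 1), date(2014, 1, 14)))    # True
print(is_expired("engineering", date(1990, 1, 1), date(2014, 1, 14)))  # False
```

Keeping the schedule in one explicit table like this also gives counsel something concrete to review and sign off on, rather than leaving retention decisions buried in per-application settings.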

The term archive has been used in a rather fast-and-loose manner over the past several years. Archiving can refer to moving infrequently accessed data to high-capacity, low-cost disk (including tiered storage), backing data up to tape, or storing it offline and off-site. Just as there's a continuum of data protection (i.e., a mix of snapshots, replication and backup), organizations will have a data archiving continuum. That continuum is what makes it possible to meet the varying retention time frames mentioned above cost-effectively. Satisfying those varying needs must be balanced against complexity, so a good archiving solution will provide the automation needed to deliver the necessary application granularity while minimizing the impact on IT operations.
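One simple way to picture the archiving continuum is as a policy that maps data age onto storage tiers. The tier names and day thresholds below are hypothetical examples, assuming a three-stage continuum of primary disk, low-cost capacity disk and off-site tape as described above.

```python
from datetime import date

# Hypothetical thresholds for an archiving continuum: primary disk for
# active data, low-cost capacity disk for infrequently accessed data,
# then off-site tape for long-term retention.
TIERS = [
    (90, "primary-disk"),    # accessed within the last ~90 days
    (365, "capacity-disk"),  # infrequently accessed
]
OFFLINE_TIER = "tape-offsite"  # everything older

def pick_tier(last_access: date, today: date) -> str:
    """Map a file's last-access age onto the archive continuum."""
    age_days = (today - last_access).days
    for limit, tier in TIERS:
        if age_days <= limit:
            return tier
    return OFFLINE_TIER

print(pick_tier(date(2014, 1, 1), date(2014, 1, 14)))  # primary-disk
print(pick_tier(date(2010, 1, 1), date(2014, 1, 14)))  # tape-offsite
```

In practice, the automation the article calls for is exactly this kind of policy engine applied continuously, so data migrates down the continuum without operator intervention.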

Data archiving benefits

IT organizations will be motivated to implement archiving either as a general-purpose enhancement or for application-specific reasons. The expected benefits, such as cost-effective storage of long-retention data and easier compliance with retention requirements, apply in either case.

Application-specific archiving products are tailored to deliver these benefits to specific environments. Examples include SAP, email and Oracle applications. Application-specific products are designed to know the ins and outs of the application so they can prune or separate data in a manner that optimizes the application without endangering referential integrity. General-purpose archivers aren't usually smart enough to do this. An application-specific tool may be all that's needed when data volumes don't justify a system-wide implementation, the major pain point relates to a specific application or a general-purpose product won't adequately address a given application.

About the author:
Phil Goodwin is a storage consultant and freelance writer.

14 Jan 2014
