Data archiving system selection dependent on features, capabilities

Once you've determined the specific data archiving needs of your organization, you'll have to sift through existing products to find the best fit.

What you will learn: A data archiving system can help meet regulatory or storage management requirements, but you need to understand archiving tools' features and capabilities when evaluating products.

Regulatory compliance, data governance and storage management are all factors that must be considered when selecting and configuring a data archiving system. Requirements will differ from industry to industry, and from one company to another, but once you've determined the specific data archiving needs of your organization, you'll then have to sift through the available products to find the best fit. Here are some key capabilities and features of data archiving systems that you'll need to consider during your product evaluations.

Architectural and policy definitions

Selecting an archiving product begins with understanding the logical architecture you need. Architecture is driven by service-level agreement (SLA) definitions: once SLAs are determined, IT architects can identify the architecture and devices needed to meet them. Think of architecture as a list of the Lego blocks needed to build a specific structure. You may need some big red ones, thin yellow ones, long blue ones and so on. The list doesn't, however, specify how the blocks are assembled for a specific situation. Archive SLAs include such things as retention period and recovery time objective; recovery point objective should not normally come into the picture. Step two, then, is defining the archive SLAs. Armed with this information, IT decision makers can objectively determine which products qualify as potential building blocks in their architecture.

How the architectural devices are assembled and configured is driven by organizational policies, which are derived from retention and data destruction terms. At the highest level, these policies describe what needs to be moved where, when and how. Going back to our Lego analogy, the answers to these questions indicate how many of which blocks need to be assembled in a particular way. This information helps buyers determine the exact solution needed, verify that vendors can meet stated requirements, and obtain hard data for price quotes and apples-to-apples comparisons. Step three is defining precisely how archiving is to be implemented.

Desirable functionality

Guidance for determining "what needs to be moved where, when and how" appears in an earlier tip on desirable functionality in archiving products. At the heart of any archiving product is the policy engine, which selects what needs to be moved, when and how. Step four for the IT decision maker is to understand a product's policy engine and match its capabilities to requirements.

Four steps in evaluating an archive product

Step 1: Inventory your applications and assign retention and destruction policies.

Step 2: Define your archive service-level requirements.

Step 3: Define your policies regarding "what needs to be moved where, when and how."

Step 4: Understand the prospective vendors' policy engines and match their capabilities to your requirements.

Policy engines drive information discovery, classification, and some form of indexing and data movement. Some products have highly preconfigured policy engines that simplify deployment and maintenance. Users who prefer out-of-the-box simplicity may find these products attractive. Other products may require a considerable amount of setup and configuration to establish the governing policies and behavior. Product vendors may offer professional service assistance to speed deployment and create an optimal experience. Users who prefer a customized solution or have unique needs -- and who are willing to pay for the additional services -- will gravitate toward these products.
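The "what, where, when and how" questions a policy engine answers can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the ArchivePolicy and DataItem classes, the tier name and the day counts are all hypothetical, and a real engine would also handle discovery, classification and indexing.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass(frozen=True)
class ArchivePolicy:
    """Hypothetical rule: what to move, when, where, and how long to keep it."""
    application: str         # what: data owned by this application
    archive_after_days: int  # when: age at which data moves to the archive
    target_tier: str         # where: illustrative tier name
    retention_days: int      # how long the archived copy is retained


@dataclass(frozen=True)
class DataItem:
    name: str
    application: str
    last_modified: date


def decide(item: DataItem, policies: Dict[str, ArchivePolicy], today: date) -> str:
    """Return the action a simple policy engine would take for one item."""
    policy = policies.get(item.application)
    if policy is None:
        return "unclassified"  # no rule covers this application yet
    age_days = (today - item.last_modified).days
    if age_days >= policy.archive_after_days + policy.retention_days:
        return "destroy"       # retention period has run out
    if age_days >= policy.archive_after_days:
        return f"archive:{policy.target_tier}"
    return "keep"              # still active; leave on primary storage


# Illustrative values only: archive after 90 days, retain for about 7 years.
policies = {"exchange": ArchivePolicy("exchange", 90, "object-store", 2555)}
item = DataItem("q1-mailbox.pst", "exchange", date(2024, 1, 1))
action = decide(item, policies, today=date(2024, 6, 1))
```

A preconfigured product would ship with rules like these already defined; a customizable one would expect the buyer, or the vendor's professional services team, to write them.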

One aspect of data classification and movement that can't be ignored is data destruction, the end of the data lifecycle. When and how data is destroyed is an important aspect of archiving. The important "how" of data destruction isn't the mechanical aspect of erasing data; it's knowing where all the data copies reside so they can be deleted. From a legal perspective, if nine copies of the data have been removed but a tenth remains, the data is still subject to court-ordered discovery.
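To make the "where do all the copies live" problem concrete, here is a minimal Python sketch of a copy catalog. The CopyCatalog class, the record ID and the location names are hypothetical; the point is simply that a record counts as destroyed only when no tracked copy remains.

```python
class CopyCatalog:
    """Hypothetical registry of every location that holds a copy of a record."""

    def __init__(self):
        # record_id -> set of locations that still hold a copy
        self._locations = {}

    def register(self, record_id, location):
        self._locations.setdefault(record_id, set()).add(location)

    def mark_deleted(self, record_id, location):
        self._locations.get(record_id, set()).discard(location)

    def remaining(self, record_id):
        return sorted(self._locations.get(record_id, set()))

    def is_destroyed(self, record_id):
        # Unknown records are reported as NOT destroyed: the catalog
        # cannot vouch for copies it never tracked.
        return record_id in self._locations and not self._locations[record_id]


catalog = CopyCatalog()
for location in ("primary-array", "backup-tape", "dr-site"):
    catalog.register("contract-0042", location)

catalog.mark_deleted("contract-0042", "primary-array")
catalog.mark_deleted("contract-0042", "backup-tape")
leftover = catalog.remaining("contract-0042")  # one copy is still out there
```

Until the last location is cleared, is_destroyed("contract-0042") stays False, which mirrors the legal point: one surviving copy keeps the data discoverable.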

Other worthwhile functionality in archive products includes deduplication, compression, encryption and write once, read many (WORM) capabilities. When reducing costs is the goal of archiving, deduplication and compression are among the most effective ways of reducing the amount of data. Applying encryption is a best practice anytime data goes outside the data center, and an increasing number of organizations are expanding the use of encryption even inside the data center as a matter of policy. So much data is now subject to regulatory scrutiny that it simply isn't worth trying to separate what is subject to scrutiny from what is not so the two can be treated differently. WORM technology is essential if compliance regulations demand that data be verifiably unchangeable.
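As a rough illustration of how deduplication and compression shrink an archive, the following Python sketch stores each unique object once, keyed by its SHA-256 digest, and compresses it with zlib. The DedupStore class is hypothetical and deduplicates whole objects only; commercial products typically work at a finer granularity, and encryption and WORM are not shown.

```python
import hashlib
import zlib


class DedupStore:
    """Hypothetical content-addressed store: one compressed copy per unique object."""

    def __init__(self):
        self._blobs = {}        # SHA-256 hex digest -> compressed bytes
        self.logical_bytes = 0  # total bytes submitted by callers

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        if digest not in self._blobs:  # identical content is stored only once
            self._blobs[digest] = zlib.compress(data)
        return digest

    def get(self, digest: str) -> bytes:
        return zlib.decompress(self._blobs[digest])

    @property
    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


store = DedupStore()
report = b"quarterly report " * 1000
first = store.put(report)
second = store.put(report)  # same attachment archived from a second mailbox
```

Comparing store.logical_bytes with store.stored_bytes gives a simple reduction figure to ask vendors about, although actual ratios depend heavily on the data.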


Because backup and recovery (B/R) products are used to move data to tape archives, it may be highly desirable to have an archive product's metadata catalog and the B/R catalog integrated. Without this integration, administrators will need to ensure that data is managed and destroyed in both products and that the policy engines in each are kept synchronized. Such a scenario is not only laborious, but fraught with the possibility of human error.
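Where the two catalogs are not integrated, the synchronization chore looks something like the reconciliation below. This is a sketch under stated assumptions: each catalog is modeled as a plain dictionary mapping a record ID to its destruction date, and the reconcile function and record IDs are hypothetical rather than taken from any product.

```python
from datetime import date


def reconcile(archive_catalog, backup_catalog):
    """Compare two {record_id: destroy_on} catalogs and report disagreements."""
    archive_ids = set(archive_catalog)
    backup_ids = set(backup_catalog)
    return {
        # Records one product knows about and the other does not
        "only_in_archive": sorted(archive_ids - backup_ids),
        "only_in_backup": sorted(backup_ids - archive_ids),
        # Records both products track, but with different destruction dates
        "date_mismatch": sorted(
            record_id
            for record_id in archive_ids & backup_ids
            if archive_catalog[record_id] != backup_catalog[record_id]
        ),
    }


archive_catalog = {
    "inv-001": date(2030, 1, 1),
    "inv-002": date(2031, 6, 30),
    "inv-003": date(2029, 12, 31),
}
backup_catalog = {
    "inv-001": date(2030, 1, 1),
    "inv-002": date(2032, 6, 30),  # policy drift between the two products
    "inv-004": date(2028, 3, 15),
}
differences = reconcile(archive_catalog, backup_catalog)
```

Every entry in the result is something an administrator would have to chase down by hand, which is the labor and error risk an integrated catalog avoids.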

Organizations that have multiple areas of need may decide to deploy more than one archive product. For example, one solution may be chosen for SAP and another for Exchange. This best-of-breed strategy increases the amount of training and requisite administration for the staff, but it may be worth the functional benefits, especially for a decentralized administrative team. Other organizations may prefer a single, enterprise-wide solution to simplify implementation and administration.


Beyond mere functionality, IT buyers will need to consider the deployment platform of the archive solutions. In most cases, archive products are software applications, but some include both hardware and software. This is especially true for WORM archive solutions that depend on hardware functionality to ensure data immutability.

In addition, IT buyers can consider cloud-based archive solutions to outsource the administration and, ideally, gain an economy of scale that can't be achieved in-house. In a cloud implementation, however, setting up the policy engine based on service-level requirements is as important as it would be for an in-house deployment. Vendors will offer varying degrees of service to assist in this effort, and that service may be one of the most important decision factors. Even though it's easy to get into a cloud environment, it may not be easy to get out due to the logistics of moving tens or hundreds of terabytes of data. Regardless of platform, an informed consumer is a smart consumer, and the result will be data that is retained just long enough.
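The logistics of leaving a cloud archive can be estimated with simple arithmetic before signing a contract. The Python sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a hypothetical sustained link utilization of 70%; the transfer_days function and its default are illustrative, and the estimate ignores provider egress fees and throttling.

```python
def transfer_days(terabytes: float, link_gbps: float, utilization: float = 0.7) -> float:
    """Estimate the days needed to move an archive over a network link.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Gbps = 10**9 bits/s.
    utilization is the assumed fraction of the link the transfer can sustain.
    """
    if link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("link_gbps must be positive and utilization in (0, 1]")
    total_bits = terabytes * 10**12 * 8
    seconds = total_bits / (link_gbps * 10**9 * utilization)
    return seconds / 86_400  # seconds per day


# 100 TB over a 1 Gbps link at the assumed 70% utilization
estimate = transfer_days(100, link_gbps=1.0)
```

Under these assumptions, 100 TB takes roughly 13 days to move, which is worth weighing against any retention deadlines or contract end dates.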

About the author:
Phil Goodwin is a storage consultant and freelance writer.
