
How to write an archiving program RFP


Not all archiving programs are alike. Here's how to craft an RFP that lets you compare programs on an equal footing and find the application that best fits your environment.

Maybe your boss said he wanted one. Or the compliance officer told you the company needed an archiving system to protect itself. With so many archiving systems on the market, putting together a request for proposal (RFP) for an archiving program for structured (database), semistructured (email) or unstructured (files and documents) data is a key step. It's equally important that your team is well-prepared to evaluate vendors' responses to the RFP so you end up with a product that fits your company's needs at a price that doesn't break your budget.

Archiving products are different from backup programs, but some archiving products work closely with backup apps (see "Archives aren't backups," below). In some cases, data is archived to comply with government or industry regulations, such as the Sarbanes-Oxley Act (SOX) of 2002 or the Health Insurance Portability and Accountability Act (HIPAA). In addition, amendments to the Federal Rules of Civil Procedure that took effect last December add more detail about how electronic documents such as email should be handled in civil cases. The bottom line is that storage administrators need to have a much better handle on what data is available and where it's stored.



Archives aren't backups

You may think a backup and an archive are one and the same, but they're not. Here's how they differ:

  • Archives provide tools to find data from a particular point in time--backups don't.
  • Users restore data from backups, but retrieve data, such as for compliance purposes, from archives.
  • Archived data isn't expected to change or be used on a day-to-day basis.
  • Archive systems are more content-aware than backups.
  • Archive systems sometimes have a mechanism for data expiration.


An RFP is a formal document explaining the type of product and features you need. Vendors use the RFP as a guide to submit suggestions about how their products might meet your needs. But you don't necessarily have to go through a formal RFP process to find the right product. For example, the KnowledgeWorks Foundation, which seeks to improve education in Ohio, opted for a Logicalis Inc. email archiving product because of the company's relationship with EMC Corp., which supplies storage hardware to KnowledgeWorks.

"Our needs weren't super-technical as far as needing a lot of different requirements that are policy related," says Matthew Barcus, senior manager, technology and Web services at KnowledgeWorks. As a nonprofit, KnowledgeWorks didn't have to worry about SOX compliance, although Barcus says the firm does its best to follow major regulations.

A more rigorous process is often required to make it easier to compare different vendors' offerings and ensure that the chosen product solves the problem. In addition, a more objective process should make it clear why certain vendors were eliminated and better define what's included in the deal with the winning vendor.

"RFPs ensure that you actually get real answers to real questions," says Dick Benton, principal consultant at GlassHouse Technologies Inc., Framingham, MA. They also ensure that you, not the vendor, are driving the procurement process. While it might seem practical to ask your storage vendor to suggest an archiving application that's compatible with an installed storage system, it's more important to make sure the archiving software meets the business needs of the company, advises Benton.

"The best practice, in all cases, is to pick the software and then pick the hardware," agrees Carolyn DiCenzo, a research VP at Gartner Inc. "Define your architecture and then find the pieces that make it work together."

Business needs
There are several business needs to keep in mind, according to Benton. First, determine what part of the company--legal, financial, sales or another group--will drive the process. Each group may have different retention and data immutability needs. If groups have differing requirements, they'll have to be reconciled as well.

Benton recommends that the IT department not take charge of the process. "God help the CIO who decides on the electronic retention policy without risk, legal or compliance management's input," he says. "Legal and compliance policies need to be clearly established and put down in writing."

If the company is buying archiving software to satisfy compliance laws, there may be other requirements as well, says Greg Schulz, founder and senior analyst at the StorageIO Group, Stillwater, MN. For example, you'll want to look at features associated with searching, such as classification, data indexing and support for advanced search capabilities.

You may also require "chain of custody" tracking that documents which users have touched or manipulated the data, as well as "litigation hold," which is the ability to put data associated with a legal matter into a special archive where it can't be changed or deleted. There may also be legal reasons why certain data must be destroyed after a specified time period.
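One way to make these requirements concrete in an RFP is to describe the behavior you expect in a few lines of logic. The following is a minimal sketch, not any vendor's implementation: the `ArchivedItem` class, its field names and both helper functions are hypothetical, and a real product would also have to prevent modification of held items, which this sketch doesn't model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class ArchivedItem:
    """Hypothetical archive record; field names are illustrative only."""
    item_id: str
    retain_until: date                    # last day the item must be kept
    legal_hold: Optional[str] = None      # name of the legal matter, if held
    custody_log: List[Tuple[str, str]] = field(default_factory=list)


def record_access(item: ArchivedItem, user: str, action: str) -> None:
    """Chain of custody: append a (user, action) entry for every touch."""
    item.custody_log.append((user, action))


def due_for_disposal(item: ArchivedItem, today: date) -> bool:
    """True only if retention has lapsed and no litigation hold applies."""
    return item.legal_hold is None and today > item.retain_until


msg = ArchivedItem("msg-001", retain_until=date(2014, 12, 31))
record_access(msg, "jsmith", "viewed")

msg.legal_hold = "hypothetical matter"
print(due_for_disposal(msg, date(2015, 6, 1)))  # False: the hold blocks disposal

msg.legal_hold = None
print(due_for_disposal(msg, date(2015, 6, 1)))  # True: retention lapsed, no hold
```

Asking vendors how their product answers the same two questions (who touched this item, and may it be destroyed today?) tends to surface differences in how holds and expiration interact.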

Other factors to consider are whether data, particularly email, is captured continuously or on a scheduled basis, and whether the archive will be encrypted, which raises the question of key management. "When that email is retrieved in some sort of legal search, who has the password to decrypt it?" asks GlassHouse Technologies' Benton. "Probably Fred, who's long since left the company," he adds.

It's also important to consider the administrative impact of products, says Benton. Products that require a great deal of human intervention might be cheaper to start with, but become more expensive over time as maintenance tasks and staff salaries increase. Accessing and reading archived data in the future may also be a key issue (see "50 years from now," below).



50 years from now

The chances are good that buried in the back of the stereo cabinet, you have eight-track tapes, minidisks, Beta tapes, laserdiscs or CD singles that require an adapter you no longer have. And you probably copied all of your records to cassettes, and then later copied the cassettes to CDs.

How do you make sure your archiving solution doesn't go the same way, joining the pile of zip disks and nine-track tapes gathering dust somewhere? And even if the media stays current, how do you make sure you can still read it?

The 100 Year Archive Task Force, organized by the Storage Networking Industry Association (SNIA), is looking at those problems and is planning to release some results next year, says Michael Peterson, chief strategy advocate for the SNIA Data Management Forum, and president of Strategic Research Corp., a consultancy in Santa Barbara, CA.

"There are two big Holy Grail problems," says Peterson. On the physical side, the media, as well as the software, operating system and drive platforms, all age, requiring a migration every three to five years to guarantee readable data. On the logical side, the information has to remain interpretable, a problem that remains unsolved. Even the new archive ISO standard for PDFs says users shouldn't count on being able to read it in the long term, he says.

SNIA hopes to help the industry develop a self-healing system that eliminates the need for physical migration, as well as develop the eXtensible Access Method (XAM) protocol to help vendors make storage platforms that are independent of user applications, he says.

SNIA's solution requires support from application vendors, who so far haven't made it a high priority, says Jim Damoulakis, CTO at GlassHouse Technologies Inc., Framingham, MA.


Feature checklist
Once the business drivers are settled, the specific features of each product can be evaluated. Kathryn Hilton, a senior analyst for policy at Mountain View, CA-based Contoural Inc., has published an ebook on archival software that lists the following as major feature classes:

  • Capture
  • Architecture
  • Classification
  • Retention management/disposition
  • Hold management and litigation support
  • Index
  • Search/retrieval
  • Reporting/audit/supervision
  • User interface
  • Administration
  • Storage management (including "single-instance store")
  • Security

Needless to say, an email archiving app has to work well with your email system, adds StorageIO Group's Schulz.

Vendor reliability is another factor, which doesn't necessarily mean bigger is better, says Gartner's DiCenzo. "Will IBM [Corp.] be around in 10 years? Yes, but will they still commit to the program you're working with?" Part of the risk factor is the retention period; users can take more risks with a shorter period of time than with a longer one, she says.

Large vendors might also offer several uncoordinated archiving solutions, says Schulz. "You mention 'Who has archiving?' and IBM puts up five hands, EMC puts up five and Hewlett-Packard [Co.] puts up three or four," he says. "They all have different pieces."

Implementing the process
Small vendors can be more flexible with pricing, says Chris Formes, IT manager at Brookfield Homes Southland Inc. in Costa Mesa, CA. Formes was looking for an archiving product to help him consolidate email. His 150 Exchange 2003 users had mailboxes ranging in size from 200MB to 9GB. Some inboxes had 7,000 messages, he says. Exchange performance had slowed to a crawl, particularly for searches, and it took Formes 40 to 45 hours to do a backup; as a result, email could only be backed up on weekends.

Last July, Formes began the archiving RFP process with his goals: reduce the size of the information store, shorten the time for backup and restore, and cut out the need for an administrator in the restore process due to his limited staff. He asked a set of trusted vendors for their recommendations and ended up with two finalists, NearPoint from Mimosa Systems Inc. and EVault InfoStage ArcWare from EVault, a Seagate company. Based on his criteria, he decided to do a proof-of-concept test with Mimosa.

Formes built a test environment with an Exchange server using 5GB to 10GB of data, and had Mimosa staff perform data extractions based on policies he specified. In the process, Formes learned more about his requirements. "I found some products put a heavy load on the Exchange server itself," he says. "Finding out what that load was, was pretty important to me." The test was conducted over two months, and Formes bought the Mimosa product and had it up and running a few months later, he says.

Even with that much planning, Formes found the archive takes more storage than he expected, almost a terabyte, for a mail store of 200GB. Because he keeps two offline backup copies, he expected to need only about three times the size of the mail store, and he isn't sure why the archive is so much bigger. "It's just something you need to plan for," he says.
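The gap between Formes' estimate and what he observed is easy to work through. This sketch assumes his three-times figure means one online archive copy plus two offline copies, and it rounds "almost a terabyte" up to 1,000GB; both are assumptions made for illustration, and the function name is hypothetical.

```python
def expected_archive_gb(mail_store_gb: float, offline_copies: int) -> float:
    """Naive estimate: one online archive copy plus each offline copy."""
    return mail_store_gb * (1 + offline_copies)


mail_store_gb = 200          # size of the Exchange mail store
observed_archive_gb = 1000   # "almost a terabyte", rounded up (assumption)

expected = expected_archive_gb(mail_store_gb, offline_copies=2)
overrun = observed_archive_gb / expected

print(f"expected about {expected:.0f}GB, observed about {observed_archive_gb}GB")
print(f"observed is roughly {overrun:.1f}x the naive estimate")
```

Under these assumptions the archive is close to five times the size of the mail store rather than three, which is the kind of margin worth asking vendors about during a proof-of-concept test.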



Major compliance regulations

Compliance regulations are behind a major uptick in archiving product sales. Some of the more widely known regulations in the U.S. include the following:

The Sarbanes-Oxley Act (SOX) of 2002 established new and enhanced accounting standards for public U.S. companies. It requires companies not only to produce information, but also to ensure that archived information isn't changed.

The Health Insurance Portability and Accountability Act (HIPAA) was enacted by Congress in 1996; among other things, it improves the security and privacy of health data. It requires patient records to be saved during the life of the patient.

The Family Educational Rights and Privacy Act (FERPA) of 1974 specifies similar privacy rights for student educational records at all schools that receive funds from the U.S. Department of Education.

The Federal Rules of Civil Procedure (FRCP), a new version of which went into effect in December 2006, require that participants in a civil case reveal retention practices and electronic formats of data. In addition, some states have their own laws on electronic evidence, which may or may not be the same as the federal laws.


Rorie McBride, technical support manager at British Telecom in Belfast, went through a process similar to that of Formes. McBride was looking at archiving products for a managed services customer he can't name that has 14,000 mailboxes and an archive load of 8.5 million messages a month, which is up from 6 million when he started.

McBride's primary concern was to find a product that could deal with that volume. His second concern was training all those users on a new application. Third, he wanted to make sure the application could scale to the size of the growing archive and perform indexing and backup.

Compliance was part of the reason the company was archiving. In addition, internal audit procedures required that email be produced when needed. "The way you have to approach that, when you don't have archiving, is [by] using your backup as an archive, which is virtually impossible and time-consuming," says McBride. While the company had been doing just that for three years, it was no longer possible with the current volume, he adds.

British Telecom received evaluation versions of several packages for testing. "My team basically thrashed them for a month," says McBride. "All of them had their plusses and minuses, but when it came down to it, we were familiar with the way [the] CommVault [product] worked" because that's what the company uses for backup. This helped ease the training issue. "Ultimately, what it came down to was price and ease of use," he says.

Database and file archiving
It can be complicated to archive database records, says StorageIO's Schulz. "Are you archiving the whole table, or extracting rows and mapping?" For example, to preserve the context of a single transaction, you may need to pull rows out of several databases. Restoring data back into the database can be problematic as well, he adds.
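The row-extraction problem Schulz describes can be illustrated with a small example. This is a simplified sketch using Python's built-in `sqlite3` module and an in-memory database: the `orders` and `order_lines` tables, their columns and the `archive_orders_closed_before` function are all hypothetical, and it moves rows between tables in one database, whereas a real archive would usually sit in a separate store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, closed_on TEXT);
    CREATE TABLE order_lines (line_id INTEGER PRIMARY KEY, order_id INTEGER,
                              sku TEXT, qty INTEGER);
    -- Archive tables with the same columns, created empty.
    CREATE TABLE orders_archive AS SELECT * FROM orders WHERE 0;
    CREATE TABLE order_lines_archive AS SELECT * FROM order_lines WHERE 0;
    INSERT INTO orders VALUES (1, 'Acme', '2005-03-01'), (2, 'Globex', '2007-02-15');
    INSERT INTO order_lines VALUES (10, 1, 'A-100', 2), (11, 1, 'B-200', 1),
                                   (12, 2, 'A-100', 5);
""")


def archive_orders_closed_before(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move each old order together with its line items, all or nothing."""
    ids = [row[0] for row in conn.execute(
        "SELECT order_id FROM orders WHERE closed_on < ?", (cutoff,))]
    with conn:  # commits on success, rolls back if any statement fails
        for order_id in ids:
            # Copy parent and child rows first so the transaction context survives.
            conn.execute("INSERT INTO orders_archive "
                         "SELECT * FROM orders WHERE order_id = ?", (order_id,))
            conn.execute("INSERT INTO order_lines_archive "
                         "SELECT * FROM order_lines WHERE order_id = ?", (order_id,))
            conn.execute("DELETE FROM order_lines WHERE order_id = ?", (order_id,))
            conn.execute("DELETE FROM orders WHERE order_id = ?", (order_id,))
    return len(ids)


moved = archive_orders_closed_before(conn, "2006-01-01")
print("orders archived:", moved)
```

Even in this toy case the archive has to know that an order is meaningless without its line items. Asking vendors how their products discover and preserve such relationships, and how they put the data back, is a reasonable RFP question.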

In the case of file archiving, the issue is file systems; for example, a file archiving system using CIFS can't be used on an OS using NFS, and vice versa, says Gartner's DiCenzo. Moreover, on the Unix platform, vendors either need access to the kernel or have to have their own file system layered on top, she says.

Some email archiving vendors are partnering with file archiving vendors, says DiCenzo. For example, CA resells Arkivio Inc., while Quest Software Inc. resells BridgeHead Software's product. "But there's a difference between bundling two products and having true integration," she says (see "Email archiving programs," PDF).

The archiving market is still fairly young and still changing, according to DiCenzo. "If you're going to wait for the perfect solution, you're still five to 10 years away," she says. But archiving has such strong benefits that it's worth looking at and beginning with today, even if it means you have to migrate in the future, she adds.





