This article can also be found in the Premium Editorial Download "Storage magazine: Upgrade path bumpy for major backup app."

Two types of archivers
Archive systems can be roughly divided into two categories depending on the way they store data. The first is the traditional, low-retrieval archive system attached to your backup software package. Such a system lets you make an archive of a selected group of files, attach limited meta data to it, such as "widget XYZ," and then have the archive system delete the backup files in question. The good news is that it allows the attachment of meta data and can reduce the number of copies in the archive by deleting the duplicate backup files as they're archived. The bad news is that if you want to search archives using different types of meta data--such as owner, time frame, etc.--you need to create multiple archives. The main use for this type of archive is to save space by deleting files attached to projects or entities that are no longer active.

The second--and newer--category of archive systems recognizes that any archived item might need to be retrieved for different reasons and would thus require different meta data. To support multiple types of retrievals, it's important to store the actual archived item only once, but with all of its meta data in a searchable database. Such a system also recognizes that a given item might be put into the archive not to save space, but so that it can be searched for logically. Unlike their predecessors, which stored the only copy of reference data, newer archive programs store an extra copy of the data, leaving the original in place.
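
The "store the item once, keep all of its meta data searchable" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ArchiveIndex` class, its method names and the sample fields (`project`, `owner`, `year`) are hypothetical and do not describe any particular archive product.

```python
import hashlib
from collections import defaultdict


class ArchiveIndex:
    """Hypothetical toy archive: payloads are stored once, keyed by content
    hash, and meta data is kept in a separate searchable index."""

    def __init__(self):
        self.items = {}                    # content hash -> payload bytes
        self.metadata = defaultdict(dict)  # content hash -> {field: value}

    def archive(self, content: bytes, **fields) -> str:
        key = hashlib.sha256(content).hexdigest()
        self.items.setdefault(key, content)  # payload is stored only once
        self.metadata[key].update(fields)    # meta data can keep growing
        return key

    def search(self, **criteria):
        """Return keys of items whose meta data matches every criterion."""
        return [
            key for key, meta in self.metadata.items()
            if all(meta.get(field) == value for field, value in criteria.items())
        ]


archive = ArchiveIndex()
spec_key = archive.archive(b"widget XYZ design spec", project="widget XYZ", owner="alice")
archive.archive(b"widget XYZ design spec", year=2006)  # same item, extra meta data
archive.archive(b"unrelated memo", owner="bob", year=2006)

by_project = archive.search(project="widget XYZ")
by_owner_and_year = archive.search(owner="alice", year=2006)
```

Both searches find the same stored item, even though they use different meta data, and the archive still holds only two payloads.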

As discussed previously, one of the problems with using backups as archives is that they won't have all occurrences of a file or message; they'll have only those items that were available when the backup was made. Some of the newer archive systems solve this problem by archiving data automatically. For example, every e-mail that comes in or is sent out is captured by the archiving system. Every time a file is saved, a version of the file is sent to the archive system.

Another advantage of newer archive systems is their use of single-instance store and delta incremental concepts. They store only one copy of a file or e-mail, no matter where it came from or who it went to. (Of course, the archiving system records who it came from or who it was sent to.) If that file or e-mail is then changed and sent/stored again, the archiving application will store only the changed bytes in the new version. Single-instance store saves a lot of disk space.
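
A minimal sketch of single-instance store combined with delta storage might look like the following. It assumes a content hash is used to detect duplicates and uses Python's `difflib.SequenceMatcher` as a stand-in for a real delta algorithm. The `SingleInstanceStore` class and the `make_delta` and `apply_delta` helpers are hypothetical names chosen for illustration.

```python
import hashlib
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus only the bytes that changed."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))       # reuse bytes already stored
        elif j2 > j1:
            ops.append(("data", new[j1:j2]))   # new or replaced bytes
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    out = bytearray()
    for op in ops:
        out += old[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)


class SingleInstanceStore:
    """Hypothetical store: one blob per distinct content, many references."""

    def __init__(self):
        self.blobs = {}       # hash -> bytes, or ("delta", base_hash, ops)
        self.references = []  # (hash, sender, recipient)

    def put(self, content, sender, recipient, base_key=None):
        key = hashlib.sha256(content).hexdigest()
        if key not in self.blobs:
            if base_key is None:
                self.blobs[key] = content
            else:
                ops = make_delta(self.get(base_key), content)
                self.blobs[key] = ("delta", base_key, ops)
        self.references.append((key, sender, recipient))  # who sent it to whom
        return key

    def get(self, key) -> bytes:
        blob = self.blobs[key]
        if isinstance(blob, bytes):
            return blob
        _, base_key, ops = blob
        return apply_delta(self.get(base_key), ops)


store = SingleInstanceStore()
v1 = b"Budget draft: total is 100 units.\n" * 20
v2 = v1.replace(b"100", b"250", 1)             # small edit to the first line

k1 = store.put(v1, "alice", "bob")
store.put(v1, "bob", "carol")                  # forwarded copy, no new blob
k2 = store.put(v2, "alice", "bob", base_key=k1)

changed_bytes = sum(len(op[1]) for op in store.blobs[k2][2] if op[0] == "data")
```

Three messages produce three reference records but only two blobs, and the second blob holds just the changed bytes plus copy instructions. A real product would also need a policy for picking the base version, which this sketch leaves to the caller.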

Regarding the format issues of backups as archives, many archive systems still grapple with those issues (see "Turning backups into archives"). Many people still store their archives on tape and, as time passes, may change their archive software. Therefore, this problem could persist even for archives (see "Which is best for archiving: Disk or tape?").

Newer archiving systems also act like a hierarchical storage management (HSM) system, automatically deleting large, older files and e-mails and invisibly replacing them with stubs that retrieve the appropriate content when accessed. This is one of the main business justifications used to sell e-mail archive software: in addition to satisfying e-discovery requests, you can save a lot of space by archiving redundant and unneeded e-mails and attachments.
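
The stub mechanism can be illustrated with a small in-memory sketch. It assumes size and age thresholds decide what gets migrated. `StubbingStore`, `Stub` and the threshold parameters are hypothetical names, not the interface of any real HSM or e-mail archive product.

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """Placeholder left in primary storage; points at the archived copy."""
    archive_key: str


class StubbingStore:
    """Hypothetical primary store that migrates items to an archive."""

    def __init__(self, archive: dict):
        self.primary = {}       # name -> (age_days, bytes) or Stub
        self.archive = archive  # archive_key -> bytes

    def save(self, name, content, age_days=0):
        self.primary[name] = (age_days, content)

    def migrate(self, min_size, min_age_days):
        """Move large, old items to the archive and leave a stub behind."""
        for name, entry in list(self.primary.items()):
            if isinstance(entry, Stub):
                continue
            age_days, content = entry
            if len(content) >= min_size and age_days >= min_age_days:
                self.archive[name] = content
                self.primary[name] = Stub(archive_key=name)

    def read(self, name) -> bytes:
        """Callers read by name and never see whether a stub was involved."""
        entry = self.primary[name]
        if isinstance(entry, Stub):
            return self.archive[entry.archive_key]
        return entry[1]

    def primary_bytes(self) -> int:
        return sum(len(e[1]) for e in self.primary.values()
                   if not isinstance(e, Stub))


store = StubbingStore(archive={})
store.save("old-report.pdf", b"x" * 5000, age_days=400)
store.save("todo.txt", b"call vendor", age_days=400)   # old but small
store.save("new-video.mov", b"y" * 9000, age_days=2)   # large but recent

before = store.primary_bytes()
store.migrate(min_size=1000, min_age_days=365)
after = store.primary_bytes()
```

Only the large, old item is replaced by a stub. Reading it through `read` still returns the original content, while the space used in primary storage drops.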

Surveys show that more than 90% of typical e-mail storage is consumed by attachments. If you can store only one copy of an attachment across multiple e-mail servers (and Exchange Storage Groups) and replace it with a stub, then you can save a lot of storage. If you add delta-block incrementals to that, you can save even more storage.
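
To get a rough feel for the numbers, the back-of-the-envelope calculation below estimates the space freed by keeping one copy of each attachment. Only the 90% attachment share comes from the figure quoted above. The 1,000 GB mail store and the average of three copies per attachment are made-up inputs for illustration, and `attachment_savings` is a hypothetical helper.

```python
def attachment_savings(total_gb, attachment_share, avg_copies):
    """Estimate GB saved by storing each attachment once instead of
    `avg_copies` times. All inputs are assumptions supplied by the caller."""
    attachment_gb = total_gb * attachment_share  # space used by attachments
    unique_gb = attachment_gb / avg_copies       # space needed for one copy each
    return attachment_gb - unique_gb


# Hypothetical inputs: 1,000 GB of mail, 90% attachments, 3 copies on average.
saved_gb = attachment_savings(total_gb=1000, attachment_share=0.9, avg_copies=3)
print(f"Estimated savings: {saved_gb:.0f} GB")
```

Under those assumptions, about 600 GB of the 1,000 GB would be freed. The estimate ignores the small amount of space taken by stubs and index data.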

If your company has more than one employee, it wouldn't be hard to build a business case for archiving. And if you're using backups as archives, you could be in for a pretty rough time when you get an electronic discovery request. Perhaps you should look at an e-mail archiving product or an enterprise content management product today.

This was first published in September 2006
