Cost-effective legacy data protection

What you will learn from this tip: Not all of the data you need to back up and save was produced today -- or last week, or even last year. For business, legal, regulatory and compliance reasons, it is often important to protect data that is a decade old or more. Learn what you can do to cut data protection costs on older data.

Economics rears its ugly head in a big way when dealing with legacy data. Most of the time, all of that old data represents a cost sink rather than a profit center. You may need to keep it on hand in case the regulators or lawyers come calling, but you aren't going to be able to generate any more value from it. At the same time, the data may have been recorded on systems that are three or more generations back, and you'd like to keep it in a form that your current systems can read.

While you can just store those old reels of 9-track tape written in VAX VMS format and hope that you never, ever need that data (or that, if you do need it, you can somehow find a system that will let you read it), it is usually better to keep that data on media and in a format that you can read. Assuming, of course, you can do it cheaply enough.

Fortunately, there are a number of things you can do with that legacy data to reduce the cost of keeping it around in a readable format.

  • Analyze it
    The first step is to figure out what you've got, how often you're going to need it and what form you're likely to need it in. "Legacy data" covers an enormous range of material, with an equally enormous range of value and accessibility requirements.

    Some organizations, such as oil companies and scientific institutions, have large amounts, often multiple terabytes (TB), of data that may have been collected years, or even decades, ago -- and which is still frequently used.

    Most enterprises will have data that must be preserved for regulatory or legal reasons and which will probably never be looked at again. However, some of that material, like archives of email messages, will need to be searched quickly for specific message threads if it is ever needed. You need to know which kind of data you are holding before you decide what to do with it.

    Data formats are another important consideration. You not only need to have the data on media you can read, you need to have it in a format your current systems can handle. It doesn't do any good to carefully transfer those old files onto new media if the files are formatted for an application you discarded years ago. You may have to convert the data as well as move it to new media.

  • Prune it
    The real question is how much of this data you want to protect. Typically a lot of 'legacy' data, perhaps 80% of it or more, isn't needed. It makes sense to do some serious housekeeping before you do any conversion.

    Many of the decisions on what to keep and what to discard can't be made by IT alone. They require input from the people who generated the data in the first place, as well as other departments such as legal and accounting.

  • Select the right technology
    After pruning, the data you're left with may have to be kept around forever. This introduces some considerations in storage. Cost per gigabyte isn't the only factor in choosing a technology for storing old data. All existing media have a limited lifespan, and preserving data permanently means rewriting it to new media before that lifespan expires. True, that will typically be 10 years or more down the road, but you need to consider the cost of transferring the data when the time comes. It may make sense to choose a longer-lived medium, such as optical disk, even if it has a higher cost, to cut down on the expense of later storage transfers.
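
The trade-off between a cheaper medium that must be rewritten more often and a longer-lived medium that costs more up front can be roughed out with simple arithmetic. The following is a minimal sketch in Python, not a pricing model: the function name `lifetime_cost` and every price, lifespan and capacity figure are hypothetical, and it assumes prices stay flat over the whole retention period, which is unlikely to hold in practice.

```python
import math


def lifetime_cost(capacity_gb, media_cost_per_gb, media_life_years,
                  transfer_cost_per_gb, retention_years):
    """Estimate the cost of keeping data readable for retention_years.

    Assumes (hypothetically) flat prices and a full rewrite to fresh
    media each time the current media reaches the end of its lifespan.
    """
    # Data is written once at year 0, then rewritten every media_life_years.
    rewrites = max(math.ceil(retention_years / media_life_years) - 1, 0)
    initial = capacity_gb * media_cost_per_gb
    # Each rewrite buys new media and pays for the transfer work.
    per_rewrite = capacity_gb * (media_cost_per_gb + transfer_cost_per_gb)
    return initial + rewrites * per_rewrite


# Hypothetical figures: 2,000 GB kept for 30 years.
short_lived = lifetime_cost(2000, 0.05, 10, 0.10, 30)  # cheap, 10-year life
long_lived = lifetime_cost(2000, 0.12, 30, 0.10, 30)   # pricier, 30-year life
print(f"short-lived medium: {short_lived:.2f}")
print(f"long-lived medium:  {long_lived:.2f}")
```

With these made-up numbers the longer-lived medium comes out cheaper over the full period, because the cheaper medium has to be bought and rewritten two more times. Different prices or lifespans can easily reverse the result, so plug in your own quotes.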

    It also pays to think ahead, especially in the area of formats. For example, converting text-type data to XML will make it a lot more accessible and easier to manipulate in the future -- factors that may pay off. Similarly, it's obvious that you're probably going to want to convert EBCDIC data to ASCII, but you might want to consider taking that a step further and putting text data, whether EBCDIC or ASCII, into Unicode format.
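
As an illustration of the EBCDIC-to-Unicode step, here is a minimal sketch using Python's built-in codecs. It assumes the source text uses EBCDIC code page 037 (`cp037`); that is only a guess at a common case, and the right code page depends on the system that wrote the data. The function name `ebcdic_to_utf8` is hypothetical. The sketch handles plain text only: records that mix text with binary or packed numeric fields need field-by-field conversion instead.

```python
def ebcdic_to_utf8(raw: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes to Unicode text and re-encode them as UTF-8.

    The default code page is an assumption; pass the one your legacy
    system actually used.
    """
    text = raw.decode(codepage)   # EBCDIC bytes -> Unicode string
    return text.encode("utf-8")   # Unicode string -> UTF-8 bytes


# Build a small EBCDIC sample so the example is self-contained.
sample = "LEGACY RECORD 001".encode("cp037")
converted = ebcdic_to_utf8(sample)
print(converted.decode("utf-8"))
```

Going straight to UTF-8 covers both goals mentioned above, since plain ASCII text is already valid UTF-8.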

  • Consider outsourcing
    Even with careful pruning, legacy data can amount to several TB of information. In the case of large or complex data migration projects, it may be more cost-effective to outsource the conversion and associated services.

    There are a number of companies which specialize in transferring data, including Disc Interchange Service Company (DISC), which has a number of brief articles on various aspects of file conversion available on its web site, and Appian Analytics.

  • Store it appropriately
    Storing media under the proper conditions will significantly prolong its life. For most media, especially tape, the most important factors are temperature and humidity.

    The other issue is making sure the media containing your legacy data are properly indexed and cataloged. Make sure all the media are properly labeled and you have a catalog showing where each tape or disk is stored. Then make sure each one is actually kept in that place.
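
A catalog does not need to be elaborate to be useful. The sketch below, in Python, shows one possible shape for it: a record per labeled tape or disk, plus a check that compares the cataloged location with where each item was actually found during a shelf audit. All names here (`MediaRecord`, `audit`, the labels and locations) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaRecord:
    label: str       # label physically written on the tape or disk
    media_type: str  # e.g. "tape" or "optical"
    contents: str    # short description of the data held
    location: str    # where the catalog says the item is stored


def audit(catalog, observed):
    """Return items whose observed location differs from the catalog.

    observed maps label -> location where the item was found.
    The result maps label -> (cataloged location, found location),
    with None as the found location when the item was not seen at all.
    """
    problems = {}
    for record in catalog:
        found = observed.get(record.label)
        if found != record.location:
            problems[record.label] = (record.location, found)
    return problems


# Hypothetical catalog and shelf check.
catalog = [
    MediaRecord("T-0001", "tape", "1994 payroll", "vault A, shelf 3"),
    MediaRecord("T-0002", "tape", "1995 payroll", "vault A, shelf 3"),
    MediaRecord("D-0001", "optical", "email archive", "vault B, shelf 1"),
]
observed = {"T-0001": "vault A, shelf 3", "T-0002": "vault B, shelf 2"}
for label, (expected, found) in audit(catalog, observed).items():
    print(f"{label}: cataloged at {expected!r}, found at {found!r}")
```

Running a check like this on a schedule is one way to confirm that each item is still where the catalog says it is.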

    About the author: Rick Cook has been writing about mass storage since the days when the term meant an 80 K floppy disk. The computers he learned on used ferrite cores and magnetic drums. For the last 20 years, he has been a freelance writer specializing in storage and other computer issues.
