
Wake up, smell the disk, and discover CO technologies

The IT world has long accepted the use of tape; now it is time to get acquainted with capacity optimization (CO). Become familiar with this new technology through the eyes of Arun Taneja and learn why he feels CO will change the way you do business … for the better.

I am often asked by the IT and storage vendor community if I think the role of tape will change over the next three years, and if so, how? Good question … and one that the traditional tape companies, such as Advanced Digital Information Corp., Quantum Corp., Storage Technology Corp., Overland Storage Inc. and others, need to answer with accuracy just to survive. Everyone I talk to has an opinion on the matter and most think there will be a change. But I do not think the industry realizes the extent of the change coming. Here's the way I see it.

Tape has been IT's crutch for the past million years. IT staff may have hated its sequential access properties; they may have cursed the media errors it generated. Indeed, they may have screamed every time they went to recover from it (if they could find the right tapes) and couldn't. But tape was all they had. At least they could explain to their management why the darn recovery failed.

The procedures were well laid out. An Iron Mountain truck arrived on time and took the tapes away. IT may have hated the tapes but tapes have become part of their DNA. That has been the reality. That is the reality. The love/hate relationship with tape goes on.

Enter SATA. Enter disk-to-disk. Enter efficient, network-based replication. And most importantly, enter capacity optimization (CO) technology. You already know about the first two. The concepts are not hard to understand. SATA is good enough. SATA is cheap. SATA allows me to introduce secondary disk for disk-based backup and restore, improving the speed and reliability of both. Excellent. But now what? I would suggest that CO is the rock star that, in conjunction with the above, will change the way we think about tape.

So what do we mean by CO? CO is to compression as Muhammad Ali is to boxing. We generally think of compression in terms of 2:1, maybe as much as 3:1. Most commonly it is implemented in front of tape drives so that we can keep two to three times the data and reduce the number of tapes. But the latest CO techniques, as implemented in products from Avamar Technologies Inc., Data Domain, DCT (now Veritas Software Corp.), Permabit, Inc. and most recently in ProtecTIER from Diligent Technologies Corp., deliver effective compression ratios in the range of 25:1. Depending on the situation, maybe 100:1 or more. Just imagine the power of such compression. Of course, it is achieved by a completely different method than traditional compression.

As an example, in traditional compression 1,000 sequential zeros would be kept as a short code such as "1,000 0" (a count followed by the value) to save space or to reduce transmission requirements. CO uses very different principles. CO breaks an object, say a file, into smaller pieces, called chunks by some vendors, identifies each chunk with a unique code and stores each distinct chunk only once (or, by design, twice for redundancy purposes). Imagine the impact of this on a full backup: You can forget duplicate files because they are history. Forget the company logo on every internal document. Forget company boilerplate that is repeated a million times. Forget the common paragraphs, word art, images, CAD files, software releases and so on. You get the point. All these are shriveled into practically nothing.
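To make the chunk idea concrete, here is a minimal sketch of a store that keeps each distinct chunk once. It is an illustration only, not any vendor's implementation: the fixed 4,096-byte chunk size, the use of SHA-256 as the "unique code" and the ChunkStore name are all assumptions made for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products may chunk differently


class ChunkStore:
    """Toy store (hypothetical) that keeps each distinct chunk only once."""

    def __init__(self):
        self.chunks = {}        # unique code (SHA-256 hex digest) -> chunk bytes
        self.logical_bytes = 0  # total bytes callers asked us to store

    def put(self, data):
        """Store data; return the ordered list of chunk codes that rebuilds it."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            code = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(code, chunk)  # no-op if chunk already stored
            recipe.append(code)
        self.logical_bytes += len(data)
        return recipe

    def get(self, recipe):
        """Reassemble the original bytes from a list of chunk codes."""
        return b"".join(self.chunks[code] for code in recipe)

    def stored_bytes(self):
        """Bytes physically held after duplicates are eliminated."""
        return sum(len(chunk) for chunk in self.chunks.values())


# Ten "full backups" of the same 16 KB document.
store = ChunkStore()
doc = b"".join(i.to_bytes(4, "big") for i in range(4096))
recipes = [store.put(doc) for _ in range(10)]
print(store.logical_bytes, store.stored_bytes())
```

In this toy run the ten identical copies occupy the space of one, which is the effect described above for duplicate files and repeated boilerplate.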

Now imagine the power of CO applied to incremental backups. You know what happens to a file today that has only one changed byte -- the whole darn file is backed up again. Imagine if only the chunk that contained the changed byte were kept. Of course, to make all this effective, the computer has to keep a "plan" of which chunks make up a specific file. The impact of CO is enormous, and it is not theoretical. I have talked with many of you who have told me you are indeed getting a 25:1 reduction or more.
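The one-changed-byte case can be sketched in a few lines. This is a hedged illustration under stated assumptions: chunks are a fixed 4,096 bytes, SHA-256 serves as the chunk code, and the backup and restore functions are hypothetical names. The list each backup returns plays the role of the "plan".

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size for illustration


def backup(data, store):
    """Add unseen chunks to store; return (plan, bytes newly stored)."""
    plan, new_bytes = [], 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        code = hashlib.sha256(chunk).hexdigest()
        if code not in store:      # only chunks never seen before cost space
            store[code] = chunk
            new_bytes += len(chunk)
        plan.append(code)          # the plan lists every chunk, new or old
    return plan, new_bytes


def restore(plan, store):
    """Rebuild a file from its plan."""
    return b"".join(store[code] for code in plan)


store = {}

# Monday: first backup of a 1 MiB file (256 distinct chunks).
monday = b"".join(i.to_bytes(4, "big") for i in range(256 * 1024))
plan_mon, new_mon = backup(monday, store)

# Tuesday: the same file with a single byte changed in place.
changed = bytearray(monday)
changed[500_000] ^= 0xFF
tuesday = bytes(changed)
plan_tue, new_tue = backup(tuesday, store)

print(new_mon, new_tue)
```

Here the second backup adds one chunk rather than the whole file, and both versions remain restorable from their plans. Note that this sketch only handles a byte changed in place; with fixed-size chunks, inserting or deleting a byte shifts every later chunk boundary, which is why chunk boundaries are often derived from the content instead.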

So what if SATA-based systems are still three times more expensive than tape implemented in a tape library? It doesn't matter. Just fill in your favorite number here. The fact is, if we apply a 25x reduction to disk, it is game over for tape. Well, maybe not so fast.
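The arithmetic behind that claim is simple enough to write down. The sketch below compares raw media cost only, using the figures quoted in this column (a 3x disk premium, 25:1 CO, and 2:1 conventional compression on tape); the function names are hypothetical and you should substitute your own numbers.

```python
def disk_to_tape_cost_ratio(disk_premium, co_ratio, tape_compression=1.0):
    """Cost of disk relative to tape per unit of backed-up data.

    disk_premium:     raw disk cost divided by raw tape cost (e.g. 3.0)
    co_ratio:         CO reduction achieved on disk (e.g. 25.0 for 25:1)
    tape_compression: conventional compression on tape (e.g. 2.0 for 2:1)
    A result below 1.0 means disk is cheaper per unit of data protected.
    """
    return disk_premium * tape_compression / co_ratio


def break_even_co_ratio(disk_premium, tape_compression=1.0):
    """CO ratio at which disk and tape cost the same per unit of data."""
    return disk_premium * tape_compression


# 3x premium, 25:1 CO, tape uncompressed
print(disk_to_tape_cost_ratio(3.0, 25.0))
# Same, but giving tape credit for 2:1 compression
print(disk_to_tape_cost_ratio(3.0, 25.0, tape_compression=2.0))
# CO ratio needed just to match compressed tape
print(break_even_co_ratio(3.0, tape_compression=2.0))
```

With these inputs disk comes out at roughly 0.12 of tape's cost, or roughly 0.24 if tape gets credit for 2:1 compression, and any CO ratio above 6:1 puts disk ahead on this measure.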

What about the removability factor? This is where the application of CO to replication comes into play. We already have instances where the CO concept has been applied to remote replication. Look at what Kashya Inc. or Topio Inc. have done, or Avamar for that matter. Now imagine that I have my backups being done on CO-based technology at the local site and replicated very efficiently and cost-effectively across the state (or the country) to another disk system. Why do I need tape now? To feed Iron Mountain coffers? Heck, no. The only reason I can think of is what I call the "crutch factor." And lest we forget -- the "crutch factor" will probably exist for another three years before most of you are comfortable with disk technologies in their new role. Most of you will still back up to tape just to be on the safe side ... as you should. The world is changing too fast, but you have to pace yourself -- lest something happen and your management blame you for being too adventurous.

Of course, you could argue that CO can be applied just as effectively to tape, but somehow I think the random access nature of disk will win out big time. So if you are a tape vendor, I say wake up and smell the disk! If you belong to IT, I say pay attention and check out these CO technologies and make yourself look like a hero inside your company.

About the author: Arun Taneja is the founder and consulting analyst for the Taneja Group. Taneja writes columns and answers questions about data management and related topics.
