Dedupe and virtualization don't solve the real problem
It's no surprise that these two topics are top of mind these days. The server virtualization juggernaut isn't slowing at all--and it's likely to pick up steam with new or improved players like Virtual Iron Software and Microsoft challenging VMware's dominance. And storage vendors are hardly sitting on their hands. They're focusing on virtual server interoperation and integration through alliances such as Symantec's recently announced Veritas Virtual Infrastructure (VxVI), a coordinated venture with Citrix Systems that bundles XenServer with Veritas Storage Foundation to create an integrated management package.
But there's still plenty of controversy about whether virtual servers screw up or enhance storage systems--whether they make it possible to use storage more efficiently or squander more precious disk capacity.
The dedupe front is just as boisterous. If you're a VTL vendor without built-in dedupe, you're ceding some serious ground to the leaders. It took more than a few years for disk to become as ubiquitous in backup environments as it is today. But in only a fraction of that time, dedupe has become an essential component of those environments, whether they use disk as a VTL or a staging cache.
But beyond deduplication becoming de rigueur for disk backup, the big brouhaha among dedupe vendors is the inline vs. post-processing dust-up. Inline proponents say it makes more sense to squeeze the air out of the data before it hits the disk. If the process starts to slow down, they say, just put the pedal to the metal by adding more horsepower to inline appliances. The post-processing camp says disrupting the data stream in any way is a no-no; it's far better to knead the data after it hits the disk. The scary part--and the one that makes purchasing decisions so hard--is that each argument is compelling. So much so that Quantum diplomatically offers both methods in a single package with its DXi-Series disk backup systems.
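To make the two camps concrete, here is a minimal Python sketch of the difference. It is an illustration under loud assumptions, not any vendor's implementation: the fixed 4-byte chunk size, the SHA-256 fingerprints, the in-memory dictionary standing in for disk, and the function names (`inline_dedupe`, `post_process_dedupe`, `restore`) are all hypothetical choices made for readability.

```python
import hashlib

CHUNK_SIZE = 4  # tiny illustrative chunk size in bytes (assumption)


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def inline_dedupe(stream: bytes):
    """Deduplicate before writing: only unique chunks ever reach 'disk'."""
    store = {}   # fingerprint -> chunk, standing in for the disk
    recipe = []  # ordered fingerprints needed to rebuild the stream
    for chunk in chunks(stream):
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)  # keep the first copy, skip repeats
        recipe.append(fp)
    return store, recipe


def post_process_dedupe(stream: bytes):
    """Write everything first, then deduplicate in a later pass."""
    landed = chunks(stream)    # the full copy lands on 'disk' untouched
    peak_chunks = len(landed)  # capacity needed before the cleanup pass
    store, recipe = {}, []
    for chunk in landed:       # the later pass over data already on disk
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)
        recipe.append(fp)
    return store, recipe, peak_chunks


def restore(store, recipe) -> bytes:
    """Rebuild the original stream from the unique chunks and the recipe."""
    return b"".join(store[fp] for fp in recipe)


backup = b"AAAABBBBAAAACCCCAAAA"
inline_store, inline_recipe = inline_dedupe(backup)
post_store, post_recipe, peak = post_process_dedupe(backup)
```

In this toy model both approaches end up holding the same unique chunks. The trade-off is where the work happens: the inline path fingerprints every chunk while the backup stream is still arriving, and the post-processing path leaves the stream alone but needs room for the full, undeduplicated copy until the second pass runs.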
It's no wonder these are the two hottest topics in storage circles. Dedupe is booming because storage managers are desperately seeking solutions to cope with soaring disk capacities. Dealing with server virtualization is also part of that issue. I keep hearing how companies have slashed the number of physical servers and ended up with more virtual servers than the former physical server count. The math is easy: More servers mean more apps, and more apps mean more data.
Dedupe and virtual server integrations address the symptom of explosive data capacity growth, but not its basic causes. The technologies translate into a couple of fingers in the dike, which may stem the tide for now, but more leaks are bound to occur that require newer and better techniques.
Any sustained solution has to deal with the sources and types of data clogging corporate disks. You have to know who's creating all that stuff, and determine where it should go and how it should be protected. To do any of that, you'll need to classify the data at the time of its creation. This is hard to do right now, and most "solutions" require an awful lot of manual work.
Data classification verged on "hotness" a couple of years ago, but it's taken a back seat recently. That's a shame, because it might be the most important part of managing information and the systems that host it. It's a technology that storage vendors need to focus on, even while newer, cooler technologies generate so much buzz. Otherwise, storage managers could get burned.