VendorFights: Data Deduplication Edition

With data deduplication in the news today, I recommend checking out the responses to Jon Toigo’s questionnaire for data deduplication vendors. I found his questions about backing up deduped data to tape and the potential legal ramifications of changing data through dedupe especially interesting. The responses from the vendors so far about hardware-based hashing are also interesting, in that they seem to break down according to whether the vendor sells a hardware-based or a software-based product.

It would be pretty disappointing if Hifn’s announcement of hardware-based hashing led to a religious war around software- vs. hardware-based dedupe systems. It’s clear (and has been generally accepted, or so I thought) that hardware performs better than software, meaning it’s in users’ best interest to improve the throughput of data deduplication systems by moving processor-intensive calculations to hardware. And the dedupe market is full of enough FUD as it is.

Speaking of which, Data Domain and EMC are getting all slapper-fight about dedupe thanks to today’s product announcement from Data Domain (and attendant comparisons to EMC/Avamar), and the fact that EMC is planning to finally roll out deduping tape libraries at EMC World (based on Quantum’s dedupe).

EMC blogger Storagezilla calls the statement by DD in a press release that its new product is 17 times faster than Avamar’s RAIN grid “nose gold” (props for the phraseology, at least), and then points out that Avamar’s back end doesn’t actually do any deduping, which is something I still don’t quite get.

So Data Domain’s box is faster at de-dup than the Avamar back end which doesn’t do any de-dup.

Since the de-dup is host based and only globally unique data leaves the NIC do I get to count the aggregate de-dup performance of all the hosts being backed up?

Yes, I do!

How does Avamar decide what data is ‘globally unique’? If this is determined before data leaves the host, then that processing must be done at the host. ‘Zilla even says he can count the aggregate performance of all the hosts being backed up in the dedupe performance equation... which brings me back to the first point again: Avamar’s back end doesn’t do de-dupe, but it’s faster at dedupe than Data Domain anyway?

Chris Mellor explored this further:

According to EMC, Avamar moves data at 10 GB/hr per node (moving unique sub-file data only). Avamar reduces typical file system data by 99.7 percent or more, so only 0.3 percent is moved daily in comparison to the amount that Data Domain has to move in conjunction with traditional backup software. This equals a 333x reduction compared to a traditional full backup (Avamar has customer data indicating as much as 500X, but 333X is a good average).
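For anyone who wants to check the arithmetic behind those figures, here is a quick sketch in Python. The inputs are only the numbers EMC quotes above. The "logical data covered" line is my own extrapolation, and it assumes every node sustains the quoted rate and that the 0.3 percent daily figure holds, neither of which has been independently verified.

```python
# Figures quoted from EMC above; nothing here is measured.
reduction_pct = 99.7          # percent of file system data Avamar claims to eliminate
node_rate_gb_per_hr = 10.0    # unique sub-file data moved per node, per EMC

# Fraction of the data that still has to cross the network each day.
moved_fraction = (100.0 - reduction_pct) / 100.0   # about 0.003

# Reduction factor relative to a traditional full backup.
reduction_factor = 1.0 / moved_fraction            # about 333x

# Assumption (not from EMC): if only 0.3 percent of the data is moved,
# 10 GB/hr of unique data corresponds to this much logical data covered.
logical_gb_per_hr = node_rate_gb_per_hr / moved_fraction

print(f"moved daily: {moved_fraction:.1%}")
print(f"reduction factor: {reduction_factor:.0f}x")
print(f"logical data covered per node: {logical_gb_per_hr:.0f} GB/hr")
```

So the 333x claim is simply 1 divided by 0.003. It says how much less data moves, not how fast anything is hashed.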

‘An EMC spokesperson’ (should we assume it was, or wasn’t, Storagezilla himself?) further stated to Mellor:

“Remember that Data Domain has to move all of the data to the box, so naturally they’re focusing on getting massive amounts of data in quickly. EMC Avamar never has to move all of that data, so instead we focus on de-dupe efficiency, high-availability and ease of restore. Attributes that are more meaningful to the customer concerned with effective backup operations.”

Again I ask, where does the determination that data is ‘globally unique’ take place? It’s got to be taking up processor cycles somewhere. The rate at which it makes those determinations, and where it makes those determinations, would be the apples-to-apples comparison with DD, which is making those calculations as data is fed into its single-box system.
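To make the question concrete, here is a toy sketch of the two approaches in Python. It is not either vendor's actual algorithm: the fixed 4-byte chunks, the SHA-256 hash, the in-memory `set` standing in for a global index, and the function names are all assumptions chosen for illustration. The point is only that the same hash-and-lookup work happens in both cases; what changes is where the cycles are spent and how much data crosses the wire.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, hypothetical chunk size so the example is easy to follow


def chunks(data: bytes, size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks (an assumption; products may chunk differently)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def source_side_backup(data: bytes, global_index: set) -> list:
    """Hash on the host and return only the chunks the global index has not seen.

    The hashing cost is paid here, on the host being backed up.
    """
    to_send = []
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in global_index:
            global_index.add(digest)
            to_send.append(chunk)
    return to_send


def target_side_backup(data: bytes, global_index: set) -> tuple:
    """Move everything to the target, which hashes and stores only new chunks.

    Returns (bytes_moved_over_network, chunks_stored).
    """
    stored = []
    for chunk in chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in global_index:
            global_index.add(digest)
            stored.append(chunk)
    return len(data), stored


# Two hosts with overlapping data, deduplicated at the source.
index_a = set()
host1 = source_side_backup(b"AAAABBBBCCCC", index_a)
host2 = source_side_backup(b"AAAABBBBDDDD", index_a)

# The same data sent whole to a single target box.
index_b = set()
moved, stored = target_side_backup(b"AAAABBBBCCCC" + b"AAAABBBBDDDD", index_b)

print(len(host1) + len(host2), "chunks sent from hosts")
print(moved, "bytes moved to target,", len(stored), "chunks stored")
```

Both paths end up storing the same four unique chunks. In a real deployment the source-side lookup would also have to consult the index over the network, which this sketch skips by sharing a set in memory.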

All of that is overlooking that the real meat and potatoes when it comes to dedupe is single-stream performance, anyway — total aggregate throughput over groups of nodes (which is really what both vendors are talking about) doesn’t mean as much. For one thing, Data Domain’s aggregate isn’t really aggregate, because it doesn’t have a global namespace yet. For another, I fail to see how EMC can even quote an aggregate TB/hr figure when talking about a group of networked nodes. Doesn’t network speed factor in pretty heavily to that equation?

Personally, I don’t think either vendor is really putting it on the line in this discussion (c’mon guys, get MAD out there ;)!). And if Avamar really performs better than Data Domain, why isn’t its dedupe IP being used in EMC’s forthcoming VTLs? (EMC continues to deny this officially, or at least refuses to confirm, but there’s internal documentation floating around at this point that indicates Quantum is the partner.)

Meanwhile, according to EMC via Mellor:

EMC says Data Domain continues to compare apples and oranges because it wants to avoid the discussion that there are a number of different backup solutions that fit a variety of unique customer use cases.

I have to admit this made me chuckle. Most of the discussions I’ve had about EMC over the last year or so have involved their numerous backup and replication products and what the heck they’re going to do with them all long-term. Finally, it seems we have an answer: Turn it into a marketing talking point!

I don’t think Data Domain even really wants to avoid that subject, either. They’re well aware that there are a number of different products out there that fit different use cases, given their positioning specifically for SMBs who want to eliminate tape.

At the same time, it’s interesting to watch the EMC marketing machine fire itself up in anticipation of a new major announcement–the scale and coordination are something to behold. This market has already been a contentious one. It’ll be interesting to see what happens now that EMC’s throwing more of its chips on the table.
