Eric Zuspan, senior system administrator of SAN/Unix at MultiCare Health System, says his department originally set out to find a fast, scalable disk system for its Unix backups and didn't care whether it included data deduplication. MultiCare backs up close to 200 TB each week for four hospitals and dozens of clinics, but "deduplication wasn't as important as getting away from being a tape jockey," says Zuspan.
Zuspan says MultiCare also evaluated a comparable product from Diligent Technologies Corp. before it was bought by IBM Corp., but wanted a pre-assembled appliance rather than having to provide hardware to go with Diligent's software. In addition to potentially complicating support, Zuspan says the Diligent software made it difficult at that time to provision underlying storage hardware using MultiCare's IBM SAN Volume Controller (SVC) storage virtualization appliance.
Meanwhile, the Windows team at MultiCare had deployed Data Domain's DD460 and DD560 deduplicating disk arrays for disk-based backup. Data Domain makes the most widely deployed data deduplication product, with approximately 2,500 installations. Like most Data Domain customers, MultiCare's Windows division was using Data Domain's inline data deduplication with a NAS interface and Ethernet connectivity.
Data Domain doesn't yet offer data deduplication across multiple nodes, although company officials say it's on the roadmap. In the meantime, MultiCare's Windows team ran up against the capacity limits of first one and then a second Data Domain box last year, and started writing to the Sepaton VTL instead. Given the slower Ethernet interfaces and inline approach to dedupe on the Data Domain box, Zuspan says "performance was pretty limited" for the Windows team as well. A typical Windows backup that had taken four and a half hours with Data Domain took an hour and 20 minutes with Sepaton. Zuspan says his company's Windows team still uses Data Domain, but may phase out its arrays over time.
The Windows team actually used Sepaton's DeltaStor data deduplication before Zuspan's Unix team. That's because Sepaton's "application-aware" deduplication must be certified separately with each backup vendor it wants to work with, while Data Domain and other vendors offer the same deduplication regardless of backup software. The Windows team was using Symantec Corp.'s Veritas NetBackup, the first backup application certified with DeltaStor. The Unix team used Hewlett-Packard Co.'s (HP) Data Protector backup software, which Sepaton didn't support until late last year.
But Zuspan says Sepaton's deduplication was worth the wait. "[Because] it's application-aware, it can see by the signature of data that it's, say, Exchange email, and then runs an algorithm based on that," he says. This leads to "huge" deduplication ratios for applications like Exchange, where Zuspan says the S2100-ES2 is currently storing 619 logical GB in 6.18 GB of physical space, roughly a 100:1 data-reduction ratio. The overall 200 TB backup, which includes files such as radiology images that can't be deduped, is stored on 75 TB of physical disk.
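To put the two figures Zuspan cites side by side, the arithmetic can be sketched in a few lines of Python. The function names here are illustrative only and are not part of any Sepaton tool; the inputs are the numbers quoted above.

```python
def reduction_ratio(logical, physical):
    """Return N for an N:1 data-reduction ratio (logical size / physical size)."""
    if physical <= 0:
        raise ValueError("physical size must be positive")
    return logical / physical


def space_saved_pct(logical, physical):
    """Return the percentage of logical capacity that never hits disk."""
    return 100.0 * (1 - physical / logical)


# Figures quoted in the article (units cancel, so GB and TB can be mixed across calls).
exchange = reduction_ratio(619, 6.18)  # Exchange data, in GB
overall = reduction_ratio(200, 75)     # whole weekly backup, in TB

print(f"Exchange: {exchange:.1f}:1, {space_saved_pct(619, 6.18):.1f}% saved")
print(f"Overall:  {overall:.2f}:1, {space_saved_pct(200, 75):.1f}% saved")
# Exchange: 100.2:1, 99.0% saved
# Overall:  2.67:1, 62.5% saved
```

The gap between the two ratios reflects the mix of data: highly redundant Exchange backups reduce about 100:1, while the overall figure is pulled down to under 3:1 by content such as radiology images that doesn't dedupe.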
Sepaton's deduplication is also post-process, which some observers say makes it difficult to get backups to offsite media, through replication or copies to tape, within the backup window. But Zuspan points to another Sepaton dedupe feature called "forward differencing," which stores the most recent backup while deleting redundant data from older backup sets. This makes short-term restores and copies to tape easier, he notes, because the most recent full backup is stored quickly and intact.
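The idea behind forward differencing, as Zuspan describes it, can be illustrated with a toy model. The Python sketch below is not Sepaton's DeltaStor algorithm, whose internals the article doesn't cover; it assumes backups arrive as lists of fixed chunks, and the class and method names are hypothetical. The newest backup is always kept whole, and each older backup is rewritten as references into the backup that followed it, plus whatever chunks only it contained.

```python
class ForwardDifferencingStore:
    """Toy model: newest backup stored intact, older ones stored as deltas."""

    def __init__(self):
        self.latest = None  # newest backup, kept as a full list of chunks
        self.deltas = []    # deltas[i] describes backup i relative to backup i + 1

    def ingest(self, chunks):
        chunks = list(chunks)
        if self.latest is not None:
            # Re-express the previous newest backup against the incoming one:
            # chunks that also appear in the new backup become references,
            # chunks unique to the old backup are kept as raw data.
            index = {chunk: pos for pos, chunk in enumerate(chunks)}
            delta = [
                ("ref", index[chunk]) if chunk in index else ("raw", chunk)
                for chunk in self.latest
            ]
            self.deltas.append(delta)
        self.latest = chunks  # the newest backup is always stored whole

    def restore(self, generation):
        total = len(self.deltas) + 1
        if self.latest is None or not 0 <= generation < total:
            raise IndexError("no such backup generation")
        current = self.latest
        # Walk backward from the newest backup, rebuilding one generation per step.
        for delta in reversed(self.deltas[generation:]):
            current = [current[v] if kind == "ref" else v for kind, v in delta]
        return current

    def stored_chunks(self):
        raw = sum(1 for d in self.deltas for kind, _ in d if kind == "raw")
        return len(self.latest or []) + raw


store = ForwardDifferencingStore()
backups = [
    [b"A", b"B", b"C"],
    [b"A", b"B", b"D"],
    [b"A", b"E", b"D"],
]
for backup in backups:
    store.ingest(backup)

print(store.restore(2) == backups[2])  # newest: read directly, no reassembly
print(store.restore(0) == backups[0])  # oldest: rebuilt through two deltas
print(store.stored_chunks())           # 5 chunks on disk for 9 logical chunks
```

In this model the trade-off is visible in `restore`: the most recent backup comes back with no reassembly at all, which is the property Zuspan credits for easier short-term restores and tape copies, while older generations cost progressively more work to rebuild.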