Many early adopters of data reduction for primary storage are NetApp dedupe
customers who have found the technology especially helpful in reducing the
capacity needs of their VMware Inc. virtual machine disk (VMDK) files. We talked with three such IT organizations about how they're using the technology.
Insurance firm dedupes for space savings, improves replication for DR
Glatfelter Insurance Group, based in York, Pa., sees space savings in the range of 42% to 88% per volume with NetApp deduplication. At the high end of the data reduction spectrum are storage volumes associated with the company's VMware ESX servers. At the low end are volumes for the Linux servers that process voice recordings of insurance claims.
"We're seeing tremendous savings across the board," said David Pittman, chief technology officer at Glatfelter Insurance Group.
Glatfelter uses iSCSI-based storage with its VMware environment and creates NFS mounts to present to the ESX hosts. When the company brings up a new virtual machine (VM), it builds a folder on the NFS mount for all of the VM's files, including pointers to the deduplicated ones.
A company can see substantial space savings when it has to store only one copy of the operating system used by multiple VMs. If an ESX Server has 20 VMs running Windows Server 2003, a system with data deduplication saves one copy of the server operating system and points to that copy rather than storing an additional one for each of the other 19 virtual machines.
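The 20-VM example can be sketched in a few lines of Python. This is a toy illustration, not NetApp's implementation: the 4 KB block size, the SHA-256 fingerprints, the `dedupe` and `savings` helpers and the synthetic VM images are all assumptions made for the sketch.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def dedupe(images):
    """Store each unique block once; return (store, per-image pointer lists)."""
    store = {}     # fingerprint -> block bytes (one physical copy)
    pointers = []  # for each image, the ordered list of fingerprints
    for image in images:
        refs = []
        for off in range(0, len(image), BLOCK_SIZE):
            block = image[off:off + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            store.setdefault(fp, block)  # keep only the first copy
            refs.append(fp)
        pointers.append(refs)
    return store, pointers


def savings(images, store):
    """Fraction of logical capacity saved by storing unique blocks only."""
    logical = sum(len(image) for image in images)
    physical = sum(len(block) for block in store.values())
    return 1 - physical / logical


# 20 hypothetical VM images: 10 shared "operating system" blocks
# plus one block of data unique to each VM.
os_part = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
images = [
    os_part + f"vm-{i:04d}".encode().ljust(BLOCK_SIZE, b"\0")
    for i in range(20)
]

store, pointers = dedupe(images)
print(f"{len(store)} unique blocks, {savings(images, store):.0%} saved")
```

Each image is kept only as a list of pointers into the shared store, so the operating system blocks are stored once no matter how many VMs reference them. The more of each image the VMs have in common, the higher the savings figure climbs.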
Pittman said that grouping virtual machines running the same version of Windows Server on the same NFS mount produces greater capacity savings. Glatfelter Insurance Group didn't deliberately group like versions of the operating system on the same NFS mount in its initial deduplication implementation, but it benefited by "blind luck" when it started deploying its NetApp FAS3040s, according to Pittman.
"Every once in a while, a blind squirrel finds a nut," he said, noting that Glatfelter saves 40 GB of disk space per VM in some cases.
Glatfelter Insurance Group has 25 volumes of production VMware NFS mounts and backs up and mirrors all of them to a disaster recovery (DR) site that also has NetApp storage. All 50 of the NFS mounts are available to the ESX hosts from the production or backup environments, and all run NetApp dedupe, according to Pittman.
Pittman said NetApp's 16 TB volume limit for deduplication hasn't been a problem, although he's pleased the new version promises to support larger volumes. He also hasn't noticed performance problems, mainly because the company uses deduplication during off-peak hours. NetApp's dedupe runs post-process, after the writes occur.
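A rough sketch of what "post-process" means in practice: writes land on disk untouched, and a separate scheduled pass later finds and frees the duplicates. The `Volume` class, its method names and the SHA-256 fingerprints below are hypothetical and are not drawn from Data Ontap.

```python
import hashlib


class Volume:
    """Toy volume: writes are stored as-is; duplicates are reclaimed later."""

    def __init__(self):
        self.blocks = {}    # block id -> bytes (physical storage)
        self.file_map = []  # logical block order -> block id
        self._next_id = 0

    def write(self, block: bytes) -> None:
        # The write path does no fingerprinting, so it adds no dedupe latency.
        self.blocks[self._next_id] = block
        self.file_map.append(self._next_id)
        self._next_id += 1

    def dedupe_pass(self) -> int:
        """Scheduled (for example, off-peak) pass; returns blocks freed."""
        seen = {}  # fingerprint -> id of the block that is kept
        freed = 0
        for pos, block_id in enumerate(self.file_map):
            fp = hashlib.sha256(self.blocks[block_id]).digest()
            keep = seen.setdefault(fp, block_id)
            if keep != block_id:
                self.file_map[pos] = keep   # repoint to the surviving copy
                del self.blocks[block_id]   # free the duplicate
                freed += 1
        return freed
```

Because the fingerprinting work happens only inside `dedupe_pass`, it can be scheduled for quiet periods, which matches Pittman's description of running deduplication off-peak.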
Pittman said the benefits of deduplication extend beyond the primary storage capacity savings. Performance of data replication to Glatfelter's DR site has improved because the system sends less data across the wire. The company also saves time and management costs, Pittman said.
"This has kind of overwhelmed us," Pittman said. He said the company plans to look at CIFS shares later this year for its next phase of deduplication, and there are no data sets he would hesitate to dedupe.
University squeezes virtual machines and plans for compression
The University of British Columbia in Vancouver has logged more than two years of experience with NetApp data deduplication for its virtual server and virtual desktop environments as well as unstructured data.
Storage capacity savings have reached approximately 60% in the VMware virtual server environment (which consists of about 800 VMs on 48 physical servers), 40% to 50% with the virtual desktop infrastructure, or VDI (about 150 virtual desktops on three blade servers, with plans to go to 600), and 5% to 10% with unstructured data that end users generate.
"We don't really have a [performance] hit to our filers to speak of. It's not noticeable," said Michael Thorson, director of infrastructure at the university.
Thorson said the school dedupes only "the volumes that make sense." For instance, the university doesn't use NetApp dedupe on its Oracle Corp. and MySQL databases. Email didn't make sense either. The IT team tried Microsoft's built-in single instance storage (SIS) with Exchange Server 2007 but saw less than 1% disk savings and decided to turn it off.
Exchange Server 2010 no longer includes SIS, raising the prospect of NetApp deduplication filling the gap. A NetApp official claimed some early customer tests showed its deduplication saved 25% to 30%. But the university has no plans to try it, according to Thorson.
The university does, however, plan to use compression when NetApp makes it available with the next release of the Data Ontap operating system, Thorson said. Overall, the university has six NetApp FAS3170s with approximately 3 PB of data.
Airport group uses dedupe with CIFS, NFS
The Alexandria, Va.-based American Association of Airport Executives, with overall data in the low terabytes, uses NetApp dedupe with its CIFS-based shared drives and five NFS volumes for its VMware environments. Space savings have been 30% with the CIFS data and 22% for VMware.
Although the association hasn't encountered any data corruption issues in more than 20 months of using NetApp deduplication with its FAS3140s, Patrick Osborne, the association's senior vice president of IT, remains uncomfortable about using the technology with his most mission-critical data sets, such as training videos and biometric files.
Osborne also hasn't devoted any time or energy to squeezing out greater capacity savings through NetApp dedupe. "For us, it's always been icing on the cake," he said.
This was first published in November 2010