Case study: NY Mets add deduplication to roster


This article can also be found in the Premium Editorial Download "Storage magazine: Multiprotocol arrays provide NAS and SAN in a single box."


Different flavors of deduplication
The key to controlling the growth of data is deduplication technology, which comes in the following forms.

Application server dedupe. Performs dedupe through software running on the application server before the data is sent to the backup server. This reduces the amount of data sent over the network to be backed up, but adds processing overhead to the application server.

Block level. Provides more granular dedupe by examining the blocks within each file. It saves only the blocks that have changed, not the entire file.

Inline. Intercepts the data on its way to the disk array and performs the deduplication function before writing the data to disk. The stored data is fully deduped and available for immediate replication or other use. This approach can impact performance.

Postprocessing. Performs deduplication after the data has been stored on the array. It avoids a performance hit during the backup, but requires additional storage capacity on the backup system to hold the data until it has been deduplicated.

Single-instance storage. Performs dedupe at the file level, which limits the amount of potential reduction by looking only at the file and not drilling down to the block level for duplication. For instance, if the contents of a file are left unchanged and only its name is changed, file-level dedupe that identifies files by name and attributes rather than by content won't recognize the duplication. In other words, it will see the renamed file as a new file and not eliminate it as a duplicate.
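The difference between file-level and block-level dedupe can be illustrated with a minimal Python sketch. It is not taken from the source or from any vendor's product: the function names, the sample files, and the 4-byte block size are all hypothetical, and the sketch assumes both approaches identify duplicates by a SHA-256 hash of the content.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny block size chosen only for readability


def digest(data: bytes) -> str:
    """Return a content fingerprint for a file or block."""
    return hashlib.sha256(data).hexdigest()


def file_level_store(files: dict) -> dict:
    """Keep one copy of each distinct whole-file content."""
    store = {}
    for content in files.values():
        store.setdefault(digest(content), content)
    return store


def block_level_store(files: dict) -> dict:
    """Keep one copy of each distinct fixed-size block across all files."""
    store = {}
    for content in files.values():
        for i in range(0, len(content), BLOCK_SIZE):
            block = content[i:i + BLOCK_SIZE]
            store.setdefault(digest(block), block)
    return store


def stored_bytes(store: dict) -> int:
    """Total bytes actually kept after dedupe."""
    return sum(len(chunk) for chunk in store.values())


# Hypothetical sample data: an original, an exact copy, and a lightly edited version.
files = {
    "roster_v1.txt": b"AAAABBBBCCCCDDDD",
    "roster_copy.txt": b"AAAABBBBCCCCDDDD",  # identical content
    "roster_v2.txt": b"AAAABBBBCCCCEEEE",    # only the last block differs
}

raw_bytes = sum(len(c) for c in files.values())
file_bytes = stored_bytes(file_level_store(files))
block_bytes = stored_bytes(block_level_store(files))
print(raw_bytes, file_bytes, block_bytes)
```

In this sketch the file-level store keeps the edited version in full because its content hash differs, while the block-level store keeps only the one block that changed. Production systems typically use much larger or variable-size blocks.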

Compression is often used in conjunction with dedupe to further reduce the volume of stored data. In most cases, companies first dedupe and then compress.
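As a rough illustration of the dedupe-then-compress order, the Python sketch below splits a backup stream into fixed-size blocks, keeps one copy of each unique block, and compresses only those unique blocks with the standard `zlib` module. The function names, the 64-byte block size, and the sample data are assumptions made for this example, not details from the source.

```python
import hashlib
import zlib

BLOCK_SIZE = 64  # hypothetical block size for the example


def dedupe_then_compress(data: bytes):
    """Dedupe fixed-size blocks first, then compress each unique block once."""
    unique = {}   # fingerprint -> compressed block
    recipe = []   # ordered fingerprints needed to rebuild the stream
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        key = hashlib.sha256(block).hexdigest()
        recipe.append(key)
        if key not in unique:
            unique[key] = zlib.compress(block)
    return unique, recipe


def restore(unique: dict, recipe: list) -> bytes:
    """Rebuild the original stream from the recipe and the stored blocks."""
    return b"".join(zlib.decompress(unique[key]) for key in recipe)


# Hypothetical backup stream: two distinct blocks repeated ten times.
backup = (b"x" * BLOCK_SIZE + b"y" * BLOCK_SIZE) * 10

unique, recipe = dedupe_then_compress(backup)
stored = sum(len(chunk) for chunk in unique.values())
print(len(backup), len(unique), stored)
```

Because duplicates are removed before compression, the compressor runs only on blocks that will actually be stored, and the recipe is enough to restore the original stream.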

Source: Lauren Whitehouse, Enterprise Strategy Group

The Data Domain appliance uses inline deduplication, which performs data reduction before the data is stored on the disk. This means the data can be replicated or otherwise managed as soon as it hits the disk, but the inline processing comes at a cost in performance (see "Different flavors of deduplication," above).

Once Milone chose Data Domain, the implementation went without a hitch. "The Data Domain appliance just attached to our backup server," he says. The IT staff handled most of the deployment with the help of a Data Domain engineer, who spent a day preparing the environment and returned a few days later to verify that everything went in correctly.

Each D2D backup appliance handles the servers at its location. In addition, data at Sterling is replicated to Shea Stadium. The organization, however, hasn't eliminated tape completely. "We still do tapes at Shea," says Milone. That will end with the next phase, which involves replicating Shea Stadium backups either to Sterling or, more likely, to a third Sterling property that will house another Data Domain appliance. At that point, both data centers will replicate to the third site and tape will disappear.

For now, data backups are happening faster and are more reliable than ever. "My staff loves it," says Milone. Whether the Mets win or lose, "I sleep a lot better now," he says.

This was first published in March 2008
