Learning to make the most of primary storage optimization

10 Nov 2008 | Alan R. Earls, Contributor

Until recently, schemes to improve storage efficiency were not applied to primary storage. Primary storage was considered sacrosanct. No one wanted to mess with something so important, and besides, there was the much larger target of swollen backup files in secondary storage.

However, times have changed. Better and more efficient management of secondary data (often now deduplicated) has put primary storage in the crosshairs for several vendors and their customers. While the methods vary considerably, the goals are still much the same -- reduce the size of primary storage as much as possible through compression and deduplication techniques. This will not only reduce the costs of primary storage, but it will also help to proportionally reduce downstream needs for backup space.

How to best accomplish primary storage optimization

According to Larry Freeman, senior marketing manager for storage efficiency at NetApp, before embarking on new technology to enhance primary storage efficiency, you should make sure you are applying thin provisioning as much as possible. This technology has already demonstrated its ability to dramatically reduce overall primary storage requirements.

What's more, it does not require any special manipulation of the primary data itself. That said, Freeman also sees a place for data deduplication in primary storage, particularly in virtualized environments where each virtual machine can carry its own redundant copies of the same files. He cautions, however, against applying it to high-performance applications: "Even though the resource loads may be light in high-performance applications, deduplication is an extra burden that can hurt performance."

Data that is refreshed frequently will not benefit significantly from data deduplication strategies. "We do see opportunities in unstructured data that is typically stored inefficiently," Freeman says. "And we expect to see more deduplication in SharePoint and Exchange server applications and even lightly used databases."

In keeping with the style of deduplication favored by NetApp, Freeman stresses that data deduplication can be accomplished best, with the least impact on performance, when it is done in the background rather than inline. "You can take advantage of periods of low activity like nights and weekends," he says.

Freeman also recommends proceeding slowly. "Some organizations try to apply this right away to every volume and then they wonder why the system is slow," he says. "We recommend starting slowly, then watch, look and learn." It also makes sense, he notes, to start with volumes you believe have lots of duplicate data and where performance isn't as much of a concern.
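
One way to find those first candidate volumes is to estimate how much block-level duplication each one holds before turning deduplication on. The Python sketch below is illustrative only: the 4 KB block size, the SHA-256 fingerprints, the function names and the in-memory sample volumes are all assumptions made for the example, not a description of how NetApp's product works.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def duplicate_fraction(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return the fraction of fixed-size blocks that repeat an earlier block."""
    seen = set()
    total = 0
    duplicates = 0
    for offset in range(0, len(data), block_size):
        digest = hashlib.sha256(data[offset:offset + block_size]).digest()
        total += 1
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates / total if total else 0.0


def rank_volumes(volumes: dict) -> list:
    """Sort volume names from most duplicated to least duplicated."""
    scores = {name: duplicate_fraction(data) for name, data in volumes.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample data: one volume of cloned images, one of unique blocks.
sample_volumes = {
    "vm_images": (b"A" * BLOCK_SIZE) * 8 + (b"B" * BLOCK_SIZE) * 2,
    "database": b"".join(bytes([i]) * BLOCK_SIZE for i in range(10)),
}
ranking = rank_volumes(sample_volumes)
print(ranking)
```

A volume near the top of such a ranking, and one where performance matters less, fits Freeman's description of a sensible place to start.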

A place for both inline and post-processing optimization

John Matze, vice president of business development at Hifn, takes a different view. When it comes to primary storage, he says, there is a place for both inline and post-processing approaches. Matze recommends understanding your data and how it is used. That will help you select the best optimization methods.

Post-processing can cause problems, such as impacting backup windows. With post-processing, you must depend on the operating system to provide a capable caching environment. "It takes time to do post-processing," says Matze, "but if you have a good caching environment you won't feel it as much." Post-processing also leaves you dependent on "pointers" to reconstruct the data, and those pointers can be put at risk by a system crash.
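
The difference between the two approaches can be sketched in a few lines of Python. This is a toy model, not any vendor's design: the `BlockStore` class, its method names and the SHA-256 fingerprinting are all hypothetical. Post-processing lands the raw data first and deduplicates it in a later pass, while inline deduplication does the same work before the data is stored. In this model, reads of deduplicated data depend entirely on the pointer table.

```python
import hashlib


class BlockStore:
    """Toy block store: unique blocks plus a pointer table per file."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> block bytes (stored once)
        self.pointers = {}  # file name -> ordered list of fingerprints
        self.staging = {}   # file name -> raw blocks awaiting post-processing

    def _add(self, name, raw_blocks):
        refs = []
        for block in raw_blocks:
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)  # keep only the first copy
            refs.append(fp)
        self.pointers[name] = refs

    def write_inline(self, name, raw_blocks):
        """Inline: deduplicate before the data lands in the store."""
        self._add(name, raw_blocks)

    def write_raw(self, name, raw_blocks):
        """Post-processing, step 1: land the data untouched."""
        self.staging[name] = list(raw_blocks)

    def post_process(self):
        """Post-processing, step 2: deduplicate later, e.g. overnight."""
        for name in list(self.staging):
            self._add(name, self.staging.pop(name))

    def read(self, name):
        """Rebuild a file; deduplicated files are rebuilt from pointers."""
        if name in self.staging:
            return b"".join(self.staging[name])
        return b"".join(self.blocks[fp] for fp in self.pointers[name])

    def stored_bytes(self):
        staged = sum(len(b) for blocks in self.staging.values() for b in blocks)
        return staged + sum(len(b) for b in self.blocks.values())


payload = [b"x" * 1024, b"y" * 1024, b"x" * 1024]

post = BlockStore()
post.write_raw("report", payload)
before = post.stored_bytes()  # full raw size until the later pass runs
post.post_process()
after = post.stored_bytes()   # duplicate block now stored once

inline = BlockStore()
inline.write_inline("report", payload)
inline_size = inline.stored_bytes()
```

Both paths end at the same stored size here. The trade-off the sketch makes visible is timing: the post-processing store needs the full raw capacity until the later pass runs, while the inline store pays the fingerprinting cost on every write.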

On the other hand, says Matze, inline optimization "gets your data clean" from the start. This can have benefits in storing and handling data thereafter.

Finally, a storage administrator should have realistic expectations. "Deduplicating backup data can produce tremendous efficiencies," he says. "But there are, in fact, fewer instances of duplicated data in primary storage compared with secondary storage."

How to optimize primary storage

Peter Smails, vice president of worldwide marketing at Storwize, lists four requirements for successful primary storage optimization.

  1. If it is to be worth your effort, you must be able to provide a high average data reduction.
  2. You must be able to minimize your impact on performance as much as possible. Ideally you would actually create a performance benefit but at a minimum you should be transparent to the users in terms of performance.
  3. You shouldn't require behavior changes on the part of users. The optimization should happen without requiring extra actions.
  4. When you are dealing with primary storage, you should aim to run business-critical applications without impacting their availability, regardless of what you are doing with compressing or deduping data.

Primary storage and compression

One of the less recognized facts of primary storage optimization is that "many of the file types that are driving growth are already compressed," says Carter George, vice president of products at Ocarina Networks. As examples, he cites most of the document types available through Microsoft Office 2007, as well as Adobe PDF files. "When you try to further compress files like these," he notes, "you usually end up making a file that is larger."
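
The effect George describes is easy to reproduce. The Python sketch below uses the standard library's zlib as a stand-in for a file format's built-in compression; the sample "document" is made-up text, and the exact sizes will vary with the data and the compressor. The first pass shrinks the text substantially, while a second pass over the already-compressed bytes saves essentially nothing.

```python
import random
import zlib

# Hypothetical "document": pseudo-random words, so the text has realistic
# redundancy without being trivially repetitive.
rng = random.Random(0)
words = ["storage", "primary", "backup", "volume", "block", "file", "data", "array"]
raw = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(raw, 9)    # first pass: stands in for a pre-compressed format
twice = zlib.compress(once, 9)  # second pass: compressing the compressed bytes

first_pass_ratio = len(once) / len(raw)
second_pass_ratio = len(twice) / len(once)
print(f"raw={len(raw)} once={len(once)} twice={len(twice)}")
print(f"first pass ratio={first_pass_ratio:.2f} second pass ratio={second_pass_ratio:.2f}")
```

A second-pass ratio at or above 1.0 means the extra pass added container overhead without removing any redundancy.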

Furthermore, the native compression scheme means that data deduplication efforts may not be able to spot files that are essentially the same. "When Microsoft compresses these files," George says, "the output is randomized so you can't recognize when two files are 99% identical."

Users also need to think about risk. "Enterprise customers really don't like to think about a situation where the original document never even existed -- as can happen with inline compression/data deduplication," George says. "No code is completely bug free, so when your only copy of a file is compressed from the start, you could end up writing garbage to disk."

Some compliance requirements mandate that for certain kinds of documents you must be able to show that archival activities didn't change any aspect of the document. According to George, "That's a high bar, but you must be able to meet it."

On the other hand, out-of-band solutions offer the opportunity to back up or snapshot and then shrink the file later.

In short, primary storage optimization seems to offer significant potential for enhanced efficiencies. But how you optimize your primary storage depends on competing technology visions. For now, the best advice may be caveat emptor -- let the buyer beware.

About the author: Alan R. Earls is a Boston-area writer focusing on the intersection of technology and business.



SearchStorage.com
All Rights Reserved, Copyright 2000 - 2009, TechTarget | Read our Privacy Policy