Maximizing your enterprise data storage capacity is more important than ever before. With the majority of storage budgets unable to keep pace with data growth rates, most storage administrators need to find ways to reduce capital and operating expenses to survive the economic downturn. In general, before adding new capacity, you should focus on making better use of your already installed capacity. And while there's no simple checklist to accomplish this, you can improve the efficiency of your storage systems using tiered storage; storage resource management (SRM), data classification and data migration tools; and techniques such as thin provisioning and data reduction.
For many organizations, the first step is to tier storage. This places mission-critical and frequently accessed information on more expensive media, while infrequently accessed and archival data moves to less expensive and higher-capacity drives. Capacity management tools, including some that produce granular information about utilization and performance, can help you determine your tiering strategy.
A June 2009 Snapshot survey conducted by Storage magazine found that 59% of the 112 respondents had already implemented a tiered storage architecture. Of those who said they didn't currently have tiered storage, 20% planned to implement a tiered storage architecture, with more than half of that number planning to do so within 12 months.
But even if you have a tiered storage system, you could be using the installed capacity inefficiently. Capacity management tools can help identify problems, while techniques such as thin provisioning and data reduction can increase utilization rates and save space. Together, these practices can help you get more use out of your systems and delay future purchases.
SRM tools provide granular view into asset usage
Storage resource management (SRM) tools can help storage administrators gain a solid understanding of their physical/virtual assets and needs, develop a tiered storage strategy that incorporates performance requirements and budget limitations, and move data to appropriate tiers. As these tools mature, matching assets to needs becomes easier. Newer capacity management and data mapping tools give administrators more granular views into asset usage, data storage needs, performance and virtual environments.
For example, tools such as EMC Corp.'s Ionix ControlCenter and Quest Software Inc.'s Foglight allow IT organizations to discover all of the IT assets used in the storage-area network (SAN) or in network-attached storage (NAS) environments. More sophisticated array-specific tools, such as Hewlett-Packard (HP) Co.'s Storage Essentials Performance Pack Enterprise plug-in and NetApp's SANscreen storage services management software, can find unused capacity, discover physical and virtual data paths, obtain device configurations, map virtual assets and sniff out utilization patterns using low-level data assembly. By trending this data over time, storage admins can see how applications perform, which applications require the most resources, and whether they have reached a point where a lack of resources will hurt performance.
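As a rough illustration of what trending utilization data over time can tell you, the sketch below fits a straight line to daily used-capacity samples and projects how many days remain before the pool fills. It is not how any of the products named above work; the function name, the daily sampling interval and the linear-growth assumption are all hypothetical.

```python
def days_until_full(used_tb, capacity_tb):
    """Project days until capacity is reached, from daily used-capacity samples.

    Assumes one sample per day and roughly linear growth (both are
    simplifications made for illustration only).
    """
    n = len(used_tb)
    if n < 2:
        return None  # not enough history to fit a trend

    days = range(n)
    mean_x = sum(days) / n
    mean_y = sum(used_tb) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_tb))
    slope = sxy / sxx
    if slope <= 0:
        return None  # usage is flat or shrinking; no projected fill date

    intercept = mean_y - slope * mean_x
    day_full = (capacity_tb - intercept) / slope
    # Days remaining, counted from the most recent sample.
    return max(0.0, day_full - (n - 1))


if __name__ == "__main__":
    samples = [10, 11, 12, 13, 14]  # TB used on five consecutive days
    print(days_until_full(samples, capacity_tb=20))
```

A real SRM tool would work from far richer data (per-application I/O, path information, seasonal patterns), but even a simple projection like this shows why historical trending matters for deciding when to buy.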
Jeff Boles, senior analyst and director of validation services at Hopkinton, Mass.-based Taneja Group, said access to granular utilization and performance data is relatively new for all but high-end, monolithic storage systems. "If you have a tool that can peer down into the fabric or the network, and see what's going on at the packet level and roll that information up, then that's really interesting," Boles said.
Utilization pattern data can help administrators understand how well their storage system handles I/O requests. Granular performance data can help determine if performance and I/O issues are forcing the organization to use more disks than its capacity requirements dictate, as well as which applications and users are at the root of the problem. "This is one more perspective on how you're utilizing storage," Boles said. "It can give you a better perspective on what your storage is really capable of."
According to HP, its Storage Essentials SRM suite has typical SRM software capabilities, while its Storage Essentials Performance Pack Enterprise plug-in is "path-aware" performance management software. The plug-in can track performance statistics in real time and trend performance data over days, weeks or months across heterogeneous SAN devices and applications.
NetApp's SANscreen SRM tool -- which NetApp acquired with Onaro Inc. in January 2008 -- provides visibility into application performance and virtualized environments. According to company representatives, it can give details about a heterogeneous environment's capacity consumption, storage resource allocation, device configuration and access paths.
Pyramid-shaped tiering model matches performance needs to budget
Once you've completed a thorough examination of your environment's assets, capacity and storage needs, it's time to take action. According to Boles, a pyramid-shaped tiering model is emerging with solid-state drives (SSDs) at the very top and tape, virtual tape libraries (VTLs) and cloud services at the base of the pyramid. The idea is to place expensive, high-performing but lower-capacity disks at the top of the model to handle intense I/O demands, and to put less-expensive, high-capacity drives or services at the bottom to hold infrequently used and archival data. Implemented correctly, tiering enables administrators to more closely match performance needs to budgetary requirements.
For some organizations, the pyramid model starts with high-performance, very expensive SSDs at the top, which the industry calls tier 0. Tier 1 includes high-performance, more expensive Fibre Channel (FC) drives and sometimes SAS drives. Tier 2 uses lower-performance, higher-capacity, less-expensive Serial ATA (SATA) drives and sometimes even SAS drives. Tier 3 is for data archives and rarely used data, and includes the least-expensive storage alternatives, including tape, VTLs and perhaps cloud storage. The key is "applying the right cost and performing technology for the application's needs," said Kyle Fitze, HP's storage platform director of marketing.
Classification and migration tools move data to correct storage tier
By classifying data according to performance and access criteria, you can determine what storage tiers your infrastructure requires and move data to the right storage asset.
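To make the idea concrete, here is a minimal sketch of a policy that maps a data set to one of the four tiers described earlier (tier 0 for SSD through tier 3 for archive). The class, field names and every threshold are hypothetical values chosen for illustration; a real policy would be driven by your own performance and access measurements.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    days_since_access: int   # age of the most recent access
    iops_required: int       # peak I/O operations per second needed


def assign_tier(ds: DataSet) -> int:
    """Return a tier number (0 = SSD ... 3 = archive) for a data set.

    All thresholds are illustrative assumptions, not vendor guidance.
    """
    if ds.iops_required >= 10_000:
        return 0  # intense I/O demand: solid-state drives
    if ds.days_since_access <= 7:
        return 1  # active data: FC or SAS drives
    if ds.days_since_access <= 90:
        return 2  # cooler data: high-capacity SATA drives
    return 3      # rarely used or archival: tape, VTL or cloud


if __name__ == "__main__":
    for ds in (
        DataSet("oltp-db", days_since_access=0, iops_required=25_000),
        DataSet("home-dirs", days_since_access=30, iops_required=200),
        DataSet("2005-projects", days_since_access=900, iops_required=5),
    ):
        print(ds.name, "-> tier", assign_tier(ds))
```

Performance need is checked before access age so that a hot database never lands on a slow tier merely because of how its access times are recorded.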
Data classification has been a particularly odious task in the past. More than 42% of Storage magazine survey respondents with a tiered storage architecture said their biggest pain point with tiered storage is classifying data so it's sent to the right tier.
Physically moving the data is also a major chore. More than 36% of those surveyed said they have to move data manually between tiers; in addition, roughly 35% said they use both manual and automatic processes to move data.
These pain points are likely to ease, though perhaps not right away. Most first-generation classification tool vendors have now moved into the lucrative e-discovery market, and the economy has slowed development of second-generation classification products. But market attention on cloud services for lowest-cost archiving, as well as media attention on cost optimization, could spur the development of tools that efficiently and easily move data from tier to tier.
"I think we're getting to the point where things are so cost compelling in that bottom tier that you have to take a closer look at how you get data there," Boles said. "Maybe that will drive more opportunities and more adoption of classification tool sets."
One company moving into the automated classification and data movement market is SmApper Technologies GmbH. SmApper's data intelligence layer allows policy-based data movement for tiered storage management.
Compellent Technologies Inc. offers Storage Center Data Progression, an automated, block-level data classification and migration tool. It automatically classifies data and moves it bi-directionally, enabling you to define tiers by rotation speed, disk drive type or RAID level.
Higher storage utilization rates with thin provisioning
Working hand in hand with a tiered storage architecture is thin provisioning, a technique that increases storage utilization rates. Operating at the block level on primary storage, thin provisioning allows administrators to dedicate virtual storage capacity to applications instead of having to pre-allocate physical capacity. The feature allocates physical storage capacity to applications only when the application actually writes data to disk.
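The allocate-on-write behavior can be sketched in a few lines. The toy model below is an assumption-laden illustration, not any vendor's implementation: the ThinPool class, its method names and the one-block granularity are all hypothetical. Volumes advertise a large virtual size, but a physical block is consumed only the first time a given virtual block is written.

```python
class ThinPool:
    """Toy thin-provisioned pool: physical blocks are consumed on first write."""

    def __init__(self, physical_blocks: int):
        self.free_blocks = physical_blocks
        self.virtual_size = {}  # volume name -> advertised size in blocks
        self.written = {}       # volume name -> set of written block numbers

    def create_volume(self, name: str, virtual_blocks: int) -> None:
        # Creating a volume reserves nothing physically.
        self.virtual_size[name] = virtual_blocks
        self.written[name] = set()

    def write(self, name: str, block: int) -> None:
        if not 0 <= block < self.virtual_size[name]:
            raise IndexError("block outside the volume's virtual size")
        if block not in self.written[name]:
            if self.free_blocks == 0:
                raise RuntimeError("pool exhausted: add physical capacity")
            self.free_blocks -= 1          # allocate on first write only
            self.written[name].add(block)

    def physical_used(self) -> int:
        return sum(len(blocks) for blocks in self.written.values())


if __name__ == "__main__":
    pool = ThinPool(physical_blocks=100)
    pool.create_volume("app1", virtual_blocks=1000)  # oversubscribed 10:1
    for block in (0, 0, 5):                          # rewriting block 0 costs nothing extra
        pool.write("app1", block)
    print(pool.physical_used(), pool.free_blocks)
```

The RuntimeError branch is the trade-off to watch: because the pool is oversubscribed, administrators need to monitor real consumption and add capacity before the physical blocks run out.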
Craig Nunes, vice president of marketing at 3PAR Inc., said his company's research indicates that only 19% of a typical data center's capacity actually holds written data. The remaining capacity is reserved for future application needs that may never materialize.
By using thin provisioning, IT administrators can get utilization rates close to 100% and delay capacity purchases until assets approach full capacity. "Generally, we find that administrators are buying anywhere from 25% to 50% of what they would have bought," Nunes said. And when purchases are made, they can take advantage of dropping hardware prices.
A number of storage equipment vendors offer thin provisioning as a feature of a capacity management tool. 3PAR offers thin provisioning technology as part of its InForm software suite. NetApp's Provisioning Manager thin provisioning technology is in its storage management software suite. HP's StorageWorks XP24000 and XP20000 Disk Arrays feature the company's XP Thin Provisioning software.
While many vendors are including thin provisioning in their products, the technology hasn't yet reached many production environments, according to an October 2008 SearchStorage.com Storage Priorities survey. Of the 208 survey respondents who divulged their 2009 plans, only 13.5% said they had already implemented thin provisioning in their environments; 21% said they would deploy the technology that year, while 35% said they would evaluate it. Thin provisioning should see wider adoption as administrators search for strategies and technologies that help them stretch their limited budgets.
Use data reduction techniques to eliminate duplicate data
Data reduction techniques such as data deduplication and single-instance storage (SIS) can now be used to remove duplicate data and fit the maximum amount of data into your existing capacity. These techniques are best used during backup, and on second- or third-tier storage where duplicate data is most likely to accumulate.
Taneja Group's Boles said typical real-world deduplication ratios hover around 15:1. The more backup data you keep and expose to the deduplication engine, the better results you'll see. Although deduplication is currently limited to second- and third-tier storage, Boles said, manufacturers are making the technology's algorithms more capable and applicable to primary storage data. "I expect next-generation deduplication technologies to evolve to the point where they're much more applicable to more primary-type data sets," Boles said.
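For readers who want to see the principle, the sketch below shows one simple form of deduplication: split the data into fixed-size chunks, hash each chunk, and store each unique chunk only once. Commercial engines are considerably more sophisticated (variable-size chunking, for example); the function names and the 4 KB chunk size here are assumptions made for illustration.

```python
import hashlib


def dedupe(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Returns (store, recipe): store maps a SHA-256 digest to chunk bytes,
    and recipe lists the digests in order so the data can be rebuilt.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Three identical chunks plus one different chunk.
    backup = b"A" * 4096 * 3 + b"B" * 4096
    store, recipe = dedupe(backup)
    print("chunks referenced:", len(recipe), "chunks stored:", len(store))
    assert restore(store, recipe) == backup
```

The ratio of chunks referenced to chunks stored is the deduplication ratio for the sample. It also shows why exposing more backup data to the engine helps: each additional backup of largely unchanged data adds many references but few new chunks.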
Compression is another key technique that can reduce your company's data footprint. It can be applied to all data, not just to duplicate copies. On archived data in particular, compression can deliver big space reductions.
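A quick way to gauge what compression might save on a given data set is to compress a sample and compare sizes. The sketch below uses Python's standard zlib module; the helper name and the sample log data are hypothetical, and actual savings depend heavily on the type of data.

```python
import zlib


def compression_savings(data: bytes, level: int = 6) -> float:
    """Return the fraction of space saved by zlib compression (0.0 to 1.0)."""
    if not data:
        return 0.0
    compressed = zlib.compress(data, level)
    # Already-compressed or random data can grow slightly; report that as 0.
    return max(0.0, 1 - len(compressed) / len(data))


if __name__ == "__main__":
    # Repetitive text, such as log files, tends to compress very well.
    sample = b"2009-06-01 backup completed OK\n" * 1000
    print(f"space saved: {compression_savings(sample):.0%}")
```

Text, logs and many database exports typically compress well, while media files and encrypted data usually do not, so it pays to test on representative samples before estimating savings.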
The least-expensive storage is what you already own
"We tell customers the least-expensive storage is the storage you already own," HP's Fitze said. "The best thing you can do is to free up that stranded capacity by intelligent tiering, and moving data that doesn't need as much performance or maybe just needs higher-capacity disk drives. Moving [data] off your enterprise drives allows you to take that investment you've already made and use it to support the growth of your applications and your business." Thin provisioning and data reduction techniques then increase utilization and maximize the capacity of your storage assets, which delays or eliminates the need for storage capacity purchases.