Reducing the data center footprint is a goal IT managers frequently cite when they want to trim power usage and costs. But there may be other motivations you haven’t considered, and some pitfalls as well. To drill into the topic, here’s an excerpt from industry expert Stephen Foskett’s presentation on “Reclaiming Capacity with Data Reduction for Primary Storage” from a recent Storage Decisions New York conference. Foskett argues that while reducing data has some obvious advantages, it can also have unexpected consequences for overall storage system performance.
Pros of data reduction systems
- Fewer disk arrays means less exposure to disk failure
- Less data to back up and more usable capacity
According to Foskett, one chief motivator for implementing data reduction technologies is often overlooked. “Having less data to store means theoretically having fewer drives that are going to fail, which means less overall exposure to failure," he explained. "Mathematically, it works out that you're much less likely to experience a data loss if you use fewer drives. So that’s a good thing; you may not have thought about that in terms of using data reduction methods, but it works.”
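Foskett does not spell out the math, but the intuition can be sketched under simple assumptions. The Python snippet below assumes drives fail independently and uses a hypothetical 2% annual failure rate; it estimates the chance that at least one drive fails in a year, not the chance of actual data loss, which also depends on RAID level and rebuild times. The function name and all figures are illustrative, not from the presentation.

```python
def prob_any_drive_failure(num_drives: int, annual_failure_rate: float) -> float:
    """Chance that at least one drive fails within a year.

    Assumes every drive fails independently with the same annual rate,
    which is a simplification of real-world behavior.
    """
    if num_drives < 0:
        raise ValueError("num_drives must be non-negative")
    if not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("annual_failure_rate must be between 0 and 1")
    # P(at least one failure) = 1 - P(no drive fails)
    return 1.0 - (1.0 - annual_failure_rate) ** num_drives


if __name__ == "__main__":
    ASSUMED_AFR = 0.02  # hypothetical annual failure rate per drive
    for drives in (48, 24):
        chance = prob_any_drive_failure(drives, ASSUMED_AFR)
        print(f"{drives} drives: {chance:.1%} chance of at least one failure per year")
```

Under these assumptions, halving the drive count lowers the exposure, though not by a full half, because the probability is not linear in the number of drives.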
Foskett added that data storage managers seeking funding for data reduction projects can make another argument: data protection management. “If you’re going to store less data, then we can figure out a way to have less data to write to the backup system, which takes up less space on the tape, takes less time to copy and has a continuing cascade of benefits," he said. "We’ll also have more capacity available for use because we just shrunk that one little guy, which means we can put more data there. If we bought a 10 TB array and shrunk our used capacity from 2 TB to 1 TB, suddenly we’ve got that much more space to use."
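The 10 TB example works out as follows. This is a minimal Python sketch using only the figures Foskett quotes; the helper name is hypothetical. It reports both free space and utilization, since the same reduction that frees capacity also lowers the utilization figure.

```python
def capacity_summary(array_tb: float, used_tb: float) -> tuple[float, float]:
    """Return (free capacity in TB, utilization as a fraction) for one array."""
    if array_tb <= 0:
        raise ValueError("array_tb must be positive")
    if not 0 <= used_tb <= array_tb:
        raise ValueError("used_tb must be between 0 and array_tb")
    free_tb = array_tb - used_tb
    utilization = used_tb / array_tb
    return free_tb, utilization


if __name__ == "__main__":
    # Figures from the quote: a 10 TB array, used capacity shrunk from 2 TB to 1 TB.
    for used in (2.0, 1.0):
        free, util = capacity_summary(10.0, used)
        print(f"used {used:.0f} TB -> free {free:.0f} TB, utilization {util:.0%}")
```

In this example free space rises from 8 TB to 9 TB while utilization drops from 20% to 10%.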
Cons of data reduction systems
- More points of failure
- Decreased capacity utilization
The other side of the story is that once you’ve implemented a data reduction technology, such as compression or deduplication, you introduce more possible points of failure, Foskett said. “If you have data reduction systems in place -- whether it’s an appliance, a piece of software or some other kind of integrated system -- that means more things that can break. It’s also a bigger basket that can fail," he noted.
Continued Foskett: "The big one is the concentration of I/O. If we use fewer disk drives, that runs counter to what we as storage administrators try to do to improve performance. Basically, disks can only give us so much I/O performance. The only thing we can do is throw more disk at it or move it to SSDs [solid-state drives]. Or use faster disks, I suppose, but that only gets us so far. If we actually use less space, and this is something that’s come back to haunt some people who’ve really attacked storage utilization costs in their environments, suddenly performance becomes a huge problem. If you drove your utilization up to 60% [vs. 20% to 30%] and used half as much storage, that would mean half as many spindles, which means half as many IOPS, which means you may find yourself in big trouble in terms of delivering the kind of performance your applications need. Then you might have to turn to something like caching or tiered storage to get back that I/O you just lost, which may mean losing the financial benefit of your data reduction efforts."
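The spindle arithmetic in that passage can be sketched in a few lines of Python. The data size, drive size and per-drive IOPS below are hypothetical values chosen for illustration; only the 30% and 60% utilization levels come from Foskett's remarks. The model treats aggregate IOPS as simply the drive count times a fixed per-drive figure and ignores caching, RAID overhead and workload mix.

```python
def drives_needed(data_gb: int, drive_gb: int, utilization_pct: int) -> int:
    """Drives required to hold data_gb when each drive is filled to utilization_pct."""
    if data_gb < 0 or drive_gb <= 0 or not 0 < utilization_pct <= 100:
        raise ValueError("invalid capacity or utilization value")
    numerator = data_gb * 100
    denominator = drive_gb * utilization_pct
    # Integer ceiling division, so a partially filled drive still counts as one drive.
    return -(-numerator // denominator)


def aggregate_iops(num_drives: int, iops_per_drive: int) -> int:
    """Rough aggregate IOPS: every drive contributes the same fixed amount."""
    return num_drives * iops_per_drive


if __name__ == "__main__":
    DATA_GB = 14_400       # hypothetical amount of stored data
    DRIVE_GB = 1_000       # hypothetical drive size
    IOPS_PER_DRIVE = 150   # hypothetical per-spindle figure
    for pct in (30, 60):
        drives = drives_needed(DATA_GB, DRIVE_GB, pct)
        iops = aggregate_iops(drives, IOPS_PER_DRIVE)
        print(f"{pct}% utilization: {drives} drives, about {iops} IOPS")
```

With these assumed numbers, moving from 30% to 60% utilization cuts the drive count from 48 to 24 and the available IOPS from 7,200 to 3,600, which is the halving Foskett describes.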
On a final note, Foskett said, "you’re actually reducing capacity utilization. This is a problem in a few different ways, but No. 1, somebody may come down to you and say, 'Remember that project we did a few years ago to improve capacity utilization? You reduced the data and now we’re back where we were, at low utilization again.' It’s not like you can give this stuff away. You’ve got storage; you can either use it or not use it, but you’ve already got it. If you can reduce the amount of data you’re storing, that can give you a little more headroom; but on the other hand, it doesn’t save you anything. Unless you power it down and sell it, it’s not like you’re getting any benefit from [reducing data]. So I’m concerned about the idea of using data reduction systems and what you get from it.”
Editor's Note: Remember to view the video to get the entire presentation on data reduction systems.
BIO: Stephen Foskett is an independent consultant and author specializing in enterprise storage and cloud computing. He is responsible for Gestalt IT, a community of independent IT thought leaders, and organizes its Tech Field Day events. He can be found online at GestaltIT.com, FoskettS.net, and on Twitter at @SFoskett.