Over the last five years, provisioning and capacity management have been at the top of the results of almost every survey I have read on IT's biggest issues. Perhaps it wasn't as big an issue prior to that, especially in the dot-com era when you were penalized for not spending enough. Then, when the bubble burst and IT spending came to a screeching halt, you were asked to find ways to live with what you had.
One IT manager told me his marching orders included finding ways to live with existing primary storage for at least another year. Out of necessity, he investigated the age of all the data on his primary storage and discovered that almost 60% had not been accessed or modified for more than 12 months. He had never done that analysis before, but it pointed to an easy solution. He archived the old data and recovered almost 60% of the space on expensive primary storage.
I am sure you have your own horror stories on capacity management, but here's the question: "Can you do anything besides moving some data from time to time to an archive or tier-two storage?" The answer is "yes," and it is called smart provisioning. One powerful technique that has recently become available is called thin provisioning (TP).
You see, eliminating old data is a good practice, but it is only a partial solution. You have to learn to allocate what you have intelligently. Before I describe TP in some detail, let's look at how you allocate storage to applications today.
If you are like most storage administrators, you have learned to over-allocate storage to an application as a means of keeping some level of sanity. The mission-critical application needs 20 gigabytes (GB) today, but you and the database administrator decide it is best to allocate a 100 GB volume to cover for growth. Both of you realize that increasing the size of the volume is disruptive and time-consuming. All the applications you are responsible for add up to 1.2 terabytes (TB) of allocated capacity. You know it is only a question of a week or two before you get asked to add another application or two, so you want to keep some spare capacity around. That's why you have a total of 2 TB of physical capacity.
Let's say all the applications combined are using a total of 400 GB today. What is your capacity utilization? Sixty percent (the 1.2 TB allocated out of 2 TB)? Twenty percent (the 400 GB actually used out of 2 TB)? The correct answer is 20%, but I would bet you are giving your management the higher number. With the tools you have today, it would be suicidal to give them the lower number, especially for Windows applications, where managing capacity is even more difficult and where you are probably operating at less than 20% efficiency.
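The arithmetic behind those two numbers is easy to check. The short Python sketch below uses only the figures from the example (2 TB physical, 1.2 TB allocated, 400 GB in use); the variable and function names are mine, and I am assuming decimal units (1 TB = 1,000 GB) for simplicity.

```python
# Figures from the example in the text (assuming 1 TB = 1000 GB).
PHYSICAL_GB = 2000   # total physical capacity on the floor
ALLOCATED_GB = 1200  # sum of the volume sizes handed to applications
USED_GB = 400        # data the applications have actually written


def percent(part_gb: float, whole_gb: float) -> float:
    """Return part_gb as a percentage of whole_gb."""
    return 100.0 * part_gb / whole_gb


# Allocated vs. physical: the flattering number.
allocated_pct = percent(ALLOCATED_GB, PHYSICAL_GB)
# Used vs. physical: the real capacity utilization.
used_pct = percent(USED_GB, PHYSICAL_GB)
# Used vs. allocated: how full the volumes themselves are.
used_of_allocated_pct = percent(USED_GB, ALLOCATED_GB)

print(f"allocated / physical: {allocated_pct:.0f}%")
print(f"used / physical:      {used_pct:.0f}%")
print(f"used / allocated:     {used_of_allocated_pct:.0f}%")
```

The third ratio is worth a look too: even the volumes you have handed out are, on average, only about a third full.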
As you dig deeper into this, you realize there are two reasons for this low capacity utilization: overbuying to make sure you are covered for unexpected needs, and over-allocating to ensure you do not have to constantly deal with the database administrator, who is nice enough but freaks out when her database crashes or slows down. TP is a technology and a concept designed to deal with both of these issues. The concept of thin provisioning, first exemplified by 3PARdata in 2003, allows you to allocate, say, 100 GB to the above application, but releases only 20 GB initially.
Thereafter, it releases additional storage in small chunks, say 5 GB, when the application needs it, neither sooner nor later. And it does so non-disruptively to the application. Theoretically, one can allocate more than the 2 TB of physical capacity across all applications. As the actual needs of the combined applications approach 2 TB, alarms are sent to the administrator so he can order additional physical storage. The idea is that this lets you buy storage more cheaply, on the assumption that cost per MB is constantly dropping and the further you can delay a purchase, the less money you will spend.
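To make the mechanics concrete, here is a toy Python model of that behavior. It is only a sketch under my own assumptions, not any vendor's implementation: the class name, the method names and the 80% alarm threshold are all hypothetical, and a real array works at a much finer granularity than whole gigabytes.

```python
class ThinPool:
    """Toy model of a thin-provisioned storage pool (all sizes in GB)."""

    def __init__(self, physical_gb, chunk_gb=5, alarm_fraction=0.8):
        self.physical_gb = physical_gb
        self.chunk_gb = chunk_gb
        # Hypothetical threshold: alarm when 80% of physical is released.
        self.alarm_fraction = alarm_fraction
        # name -> {"allocated": size promised, "released": size backed by disk}
        self.volumes = {}

    def total_allocated(self):
        return sum(v["allocated"] for v in self.volumes.values())

    def total_released(self):
        return sum(v["released"] for v in self.volumes.values())

    def _check_physical(self, extra_gb):
        # Only released capacity has to fit on physical disk.
        if self.total_released() + extra_gb > self.physical_gb:
            raise RuntimeError("pool exhausted: add physical storage")

    def create_volume(self, name, allocated_gb, initial_gb):
        """Promise allocated_gb to the application, release only initial_gb."""
        self._check_physical(initial_gb)
        self.volumes[name] = {"allocated": allocated_gb, "released": initial_gb}

    def grow(self, name, needed_gb):
        """Release chunks until the volume can hold needed_gb."""
        vol = self.volumes[name]
        if needed_gb > vol["allocated"]:
            raise ValueError("need exceeds the volume's allocated size")
        while vol["released"] < needed_gb:
            step = min(self.chunk_gb, vol["allocated"] - vol["released"])
            self._check_physical(step)
            vol["released"] += step
        return vol["released"]

    def alarm(self):
        """True once released capacity nears the physical limit."""
        return self.total_released() >= self.alarm_fraction * self.physical_gb


pool = ThinPool(physical_gb=2000)
pool.create_volume("db", allocated_gb=100, initial_gb=20)
released = pool.grow("db", needed_gb=23)  # one 5 GB chunk is released

# Over-allocation: 29 more volumes of 100 GB each, 20 GB released apiece.
for i in range(29):
    pool.create_volume(f"app{i}", allocated_gb=100, initial_gb=20)

print(released, pool.total_allocated(), pool.total_released(), pool.alarm())
```

In this run the pool has promised 3 TB against 2 TB of physical disk, yet only a few hundred gigabytes are actually released, so no alarm fires. That gap is the whole point of TP, and it is also why the alarm matters.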
The best example of TP I have seen is that of Cloverleaf, a network virtualization player. Let me run through an example using their terminology. Cloverleaf uses the concepts of reserved, hard limit, soft limit and current capacity. Using the example above, the current capacity would be 20 GB. Let's say reserved is 100 GB, hard limit is 300 GB, soft limit is 50 GB and the chunk is 5 GB. This means that 100 GB has been taken out of the pool (you can make this number very small if you wish). When the actual capacity utilization reaches 50 GB, you get a soft alert. If you have a runaway application that greedily grabs storage, you will get a hard alert at 300 GB. You can take appropriate actions as these alerts occur. Of course, as the cumulative physical capacity limits are reached, you will get an alert so you can order more storage. The point at which you want that last alert depends on how much lead time you need for ordering, receiving and installing new storage. The addition of new storage and its incorporation into the existing storage pool is non-disruptive in the case of Cloverleaf, but may not be in all cases.
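The limits in that example can be sketched in a few lines of Python. To be clear, this is my own illustration and not Cloverleaf's interface: the names VolumePolicy, alert_level and drawn_from_pool are hypothetical, and the rule that a volume grows past its reservation in whole chunks is my assumption about how the pieces fit together.

```python
from dataclasses import dataclass


@dataclass
class VolumePolicy:
    reserved_gb: int    # taken out of the pool up front
    soft_limit_gb: int  # usage at which a soft alert is raised
    hard_limit_gb: int  # usage at which a hard alert is raised
    chunk_gb: int       # growth increment


def alert_level(policy: VolumePolicy, current_gb: int) -> str:
    """Classify current usage as 'ok', 'soft' or 'hard'."""
    if current_gb >= policy.hard_limit_gb:
        return "hard"
    if current_gb >= policy.soft_limit_gb:
        return "soft"
    return "ok"


def drawn_from_pool(policy: VolumePolicy, current_gb: int) -> int:
    """GB this volume holds out of the shared pool.

    Assumption: the reservation is held regardless of usage, and growth
    beyond it is taken in whole chunks.
    """
    if current_gb <= policy.reserved_gb:
        return policy.reserved_gb
    over_gb = current_gb - policy.reserved_gb
    chunks = -(-over_gb // policy.chunk_gb)  # integer ceiling division
    return policy.reserved_gb + chunks * policy.chunk_gb


# The numbers used in the text.
policy = VolumePolicy(reserved_gb=100, soft_limit_gb=50,
                      hard_limit_gb=300, chunk_gb=5)

for current_gb in (20, 50, 103, 300):
    print(current_gb, alert_level(policy, current_gb),
          drawn_from_pool(policy, current_gb))
```

Note that with a 100 GB reservation, the volume holds 100 GB out of the pool even while it is using only 20 GB, which is why you may want to set the reserved figure low.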
TP is one of the finest technologies to hit the market for primary storage capacity management in a decade, maybe more. I highly recommend you investigate the possibility of applying thin provisioning to all your mission-critical applications. But I want you to be aware of a few "gotchas" as you evaluate the different products that have become available in the market.
You should check with your storage provider whether the current product line supports TP. Chances are that if it does not today, it will soon, since it is becoming a competitive issue for all vendors. I think HP has added it to EVA models. Of course, 3PARdata pioneered the concept and has had this functionality since 2003. Cloverleaf enables TP for any kind of storage. This is a major advantage, since buying all new storage from 3PARdata, for instance, may not be a feasible alternative, but making your "lesser" storage thin provisioning capable can be very appealing.
The gotchas I want you to watch out for have to do with the behavior of the databases, file systems and operating systems (OS) as they interact with TP. In many cases, if you allocate 100 GB to the application, the application (or the file system associated with it) will mark the entire 100 GB with metadata. The file system knows that while it only needs 20 GB, if it gets 100 GB, it can accelerate application performance by placing data on several disk drives, for instance, and having them deliver data in parallel. If the application behaves in this fashion, realize that the full 100 GB is gone from the storage pool and is no longer available. This defeats the entire purpose of TP. So before you make a decision to buy TP, ask the vendor how they deal with the individual application/OS/database/file system combinations that exist in your environment.
The other consideration (less of a gotcha) is how snapshots are taken and the pool from which their storage is drawn. In the case of Cloverleaf, the snapshots draw from a separate snapshot pool that also feeds virtual replicas. These are read/write snapshots that are used for making a volume available for data warehousing, testing or another application.
I love the concept of TP. I think it can be a major contributor towards bringing sanity to the provisioning process. I would suggest that it would even improve your relationship with the DBA as you now can give her a large quantity of storage, yet control the actual release of it in measured doses. Mind you, she has to play ball, too, and not blindly grab the large amount you allocate to her. TP should also help you make the process of buying new storage saner. It also will help identify which applications are runaway trains and need corralling. Provisioning should be fairly mindless and mostly automatic, with exception control. TP should help you get there. Check it out.
About the author: Arun Taneja is the founder and consulting analyst for the Taneja Group. Taneja writes columns and answers questions about data management and related topics.