Published: 12 Jul 2006
Looking for disk in all the wrong places
To understand storage consumption, you need to follow its trail from acquisition to actual use.
THE HIGH RATE of storage consumption and its associated costs continue to frustrate IT executives. The never-ending stream of approval requests for more storage invariably raises questions about where and how all of this storage is being consumed. This is typically the beginning of the quest to understand storage utilization or, more specifically, why utilization rates are so low and what can be done about it.
Source: GlassHouse Technologies
Analyzing utilization is a bit like reviewing the stats of a baseball game in the ninth inning--you have a slight chance of affecting the outcome of the game, but it's probably too late. Storage has already been allocated and any effort to reallocate or migrate it is likely to be rebuffed as too difficult or disruptive. Besides, while the actual utilization rate is important, it's just the tip of the iceberg. To better understand the problem of storage consumption, one must examine the overall request and provisioning process, and recognize the roles that data management and protection policies play.
Ask and ye shall receive
The request and provisioning process is a multistage effort, with approval, policy and fulfillment steps that often result in massive overallocation and poor utilization. How does this happen? Imperfect knowledge is one reason; we don't know how much storage we'll need, so we make an estimate and then pad that number for safety.
But why not ask for what's needed now and request more when and if it's needed? Typical purchase and allocation processes are simply not structured to support this. First, there's the challenge of acquisition. The funding vehicle for storage is often a new project, and funds may be available only at the time of the project launch. Another contributing factor is concern about the impact of making changes to a production environment, so the tendency is to add storage now to avoid future disruptions.
The shrinking storage array
Data protection policies and the provisioning process compound the problem. Local politicians complain about sending a dollar to Washington and getting back a quarter--the storage provisioning process seems to follow the same pattern.
Here's how a quantity of storage shrinks at each stage as it's configured and provisioned (Chart 1):
Let me explain:
PHYSICAL: This represents raw, unconfigured storage; it's the aggregate amount that someone signing off on a purchasing request sees. A small amount of capacity (the light-colored part of the column) is lost due to overhead--management volumes, hot spares, etc.
LOGICAL: This is available storage after applying data protection policies. Protection policies, typically expressed as RAID levels, reduce the usable portion of physical storage. The gap between physical and logical is predictable and driven solely by the organization's protection policies. The difference between RAID 10 and RAID 5, for example, can dramatically impact storage costs.
ALLOCATED: When service requests are made, logical storage is allocated. In most organizations, however, the logical storage pool is rarely 100% depleted because some capacity is set aside for unanticipated needs or emergencies. The size of this buffer is directly affected by the efficiency of the organization's purchasing process.
CLAIMED: Allocated storage must be claimed and used. In large organizations, some allocated volumes remain unclaimed. These "orphan" LUNs were handed off from the storage admin to a system admin, but never put into production. Another source of orphans is storage associated with a retired system or application that's never been reclaimed. This is a process problem and may represent a significant, low-impact opportunity to recover capacity. Ideally, the delta between allocated and claimed should be near zero.
ASSIGNED: Claimed storage is assigned to servers and presented to apps as volumes and file systems. The gap between claimed and assigned is another area for improvement.
WRITTEN: This is the assigned storage that contains data. The efficiency of written to assigned depends on the application and relates back to the accuracy of the initial storage request.
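To make the shrinkage concrete, here is a minimal Python sketch that walks the six stages and reports what's lost at each step. The capacity figures are hypothetical and chosen only for illustration; they don't come from the chart.

```python
# Hypothetical capacities in TB at each stage (illustrative values only,
# not taken from the article's chart).
STAGES = [
    ("physical", 1000.0),
    ("logical", 400.0),
    ("allocated", 280.0),
    ("claimed", 250.0),
    ("assigned", 175.0),
    ("written", 70.0),
]


def stage_losses(stages):
    """Return (from_stage, to_stage, tb_lost, fraction_retained) per adjacent pair."""
    rows = []
    for (prev_name, prev_cap), (name, cap) in zip(stages, stages[1:]):
        rows.append((prev_name, name, prev_cap - cap, cap / prev_cap))
    return rows


for src, dst, lost, kept in stage_losses(STAGES):
    print(f"{src:>9} -> {dst:<9} lost {lost:6.1f} TB, retained {kept:.0%}")
```

Each row corresponds to one of the gaps described above, so the largest losses are easy to spot.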
An executive sees storage as an IT budget line item; a storage admin sees it as frames and LUNs allocated to systems; and a system admin sees storage as volumes and file systems. When we speak of utilization, whose perspective are we taking?
Each view must be considered. When a storage manager says he's targeting 70% utilization, which two levels are being compared? How does a system admin arrive at a utilization metric of 40%? When an executive complains that storage utilization is only 7%, what's his comparison based on? They're all looking at the same environment, but the storage manager is viewing logical vs. allocated storage, the system admin is seeing file-system utilization numbers (assigned vs. written), and the executive is comparing actual data written to physical storage purchased. All of their assessments may be accurate, and each perspective provides an important piece of the consumption puzzle.
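A short Python sketch shows how one environment can produce all three numbers. The capacities are hypothetical, picked so the ratios reproduce the 70%, 40% and 7% figures above; only the choice of which two levels to compare follows the text.

```python
# Hypothetical capacities in TB, chosen so the ratios match the
# 70%, 40% and 7% figures quoted in the text.
capacity = {
    "physical": 1000.0,
    "logical": 400.0,
    "allocated": 280.0,
    "claimed": 250.0,
    "assigned": 175.0,
    "written": 70.0,
}


def utilization(used_level, base_level):
    """Utilization of one level expressed as a fraction of another."""
    return capacity[used_level] / capacity[base_level]


views = {
    # Storage manager: allocated storage vs. the logical pool
    "storage_manager": utilization("allocated", "logical"),
    # System admin: data written vs. assigned file systems
    "system_admin": utilization("written", "assigned"),
    # Executive: data written vs. physical storage purchased
    "executive": utilization("written", "physical"),
}

for role, value in views.items():
    print(f"{role}: {value:.0%}")
```

All three figures are correct for the same environment; they differ only in the pair of levels being compared.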
It's all about efficiency
Rather than utilization per se, we should focus on storage efficiency; each transformation level represents an important component of an overall efficiency metric. Efficiency targets should be established at each level, and results measured against them. The goal is to minimize the light-colored parts of the bars on the chart wherever possible.
This chart highlights the primary responsibilities at each stage of storage allocation:
Source: GlassHouse Technologies
The capacity reduction resulting from data protection policies (physical to logical) is owned primarily by the storage architect, who must translate business data protection requirements into standard storage configurations that support those policies.
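As a rough illustration of how a protection policy drives the physical-to-logical gap, the Python sketch below compares RAID 10 and RAID 5. It's a simplification: it assumes single-parity RAID 5 groups of a given size and ignores hot spares, management volumes and vendor-specific overhead. The function names and the default group size are assumptions made for this example.

```python
def usable_fraction(raid_level, group_size=8):
    """Fraction of raw capacity left after RAID protection (simplified)."""
    if raid_level == "RAID10":
        # Mirroring keeps two copies of everything.
        return 0.5
    if raid_level == "RAID5":
        if group_size < 3:
            raise ValueError("RAID 5 needs at least three disks per group")
        # One disk's worth of parity per group.
        return (group_size - 1) / group_size
    raise ValueError(f"unsupported RAID level: {raid_level}")


def logical_capacity(raw_tb, raid_level, group_size=8):
    """Logical capacity in TB for a given raw capacity and policy."""
    return raw_tb * usable_fraction(raid_level, group_size)


raw = 100.0  # hypothetical raw capacity in TB
print("RAID 10:", logical_capacity(raw, "RAID10"), "TB")
print("RAID 5 (7+1):", logical_capacity(raw, "RAID5", group_size=8), "TB")
```

Under these assumptions, the same 100TB of raw disk yields 50TB with RAID 10 and 87.5TB with 7+1 RAID 5, which is why the choice of standard configurations matters so much to cost.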
The storage admin is responsible for allocating storage based on user requests, while the system admin assigns and makes storage available to applications. Unclaimed storage is a handoff issue between storage and system admins, and each must bear some responsibility in the process. Finally, file-system and database utilization may be the shared responsibility of the application owner and the system admin depending on the environment. These responsibilities can vary greatly from one organization to another.
Other functional areas can play a significant role, too. For example, the speed of the purchasing group's acquisition process directly impacts the reserve of allocated storage maintained by a storage administrator, so purchasing should share some of the responsibility for meeting target levels. What are the appropriate target levels for each category? Unfortunately, the classic consultant's answer, "It depends," applies. However, this graph shows some sample targets:
Source: GlassHouse Technologies
Clearly the physical-to-logical gap is policy based, and will have different values depending on the storage tiering structure. The allocated-to-claimed gap should be close to zero in most cases. The remaining areas depend on organizational variables and the time impact of things like the change management and purchasing processes.
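One way to track targets is to compare measured stage-to-stage efficiency against a target for each transition. The Python sketch below does that; both the target values and the measured values are hypothetical and are not the sample targets from the graph.

```python
# Hypothetical minimum efficiency targets per transition
# (not the sample targets from the article's graph).
TARGETS = {
    "logical/physical": 0.50,
    "allocated/logical": 0.80,
    "claimed/allocated": 0.95,
    "assigned/claimed": 0.85,
    "written/assigned": 0.60,
}

# Hypothetical measured efficiencies for one environment.
MEASURED = {
    "logical/physical": 0.50,
    "allocated/logical": 0.70,
    "claimed/allocated": 0.89,
    "assigned/claimed": 0.70,
    "written/assigned": 0.40,
}


def shortfalls(measured, targets):
    """Return {transition: gap} for every transition below its target."""
    gaps = {}
    for transition, target in targets.items():
        gap = target - measured[transition]
        if gap > 0:
            gaps[transition] = round(gap, 2)
    return gaps


for transition, gap in shortfalls(MEASURED, TARGETS).items():
    print(f"{transition}: {gap:.0%} below target")
```

A report like this points each shortfall at the group that owns that transition.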
Assigned vs. written represents traditional file-system utilization, which brings us back to where we started--the basis for the initial storage request. The more accurate the size estimates here, the lower the overall multiple of physical storage purchased to data actually written.
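If the overall multiple is read as physical storage purchased per unit of data written, it can be sketched as the inverse of the product of the stage efficiencies. The Python example below uses hypothetical efficiency values to show how a better size estimate at the last stage lowers the multiple; the function names are assumptions made for this illustration.

```python
import math


def overall_multiple(efficiencies):
    """Physical TB purchased per TB written, given stage-to-stage efficiencies."""
    return 1.0 / math.prod(efficiencies)


def physical_needed(data_tb, efficiencies):
    """Raw capacity to buy so that data_tb can actually be written."""
    return data_tb * overall_multiple(efficiencies)


# Hypothetical efficiencies, in order: physical->logical, logical->allocated,
# allocated->claimed, claimed->assigned, assigned->written.
baseline = [0.5, 0.8, 1.0, 0.8, 0.5]
# Same environment with a more accurate initial request (last stage improves).
improved = [0.5, 0.8, 1.0, 0.8, 0.625]

print(f"baseline multiple: {overall_multiple(baseline):.2f}")
print(f"improved multiple: {overall_multiple(improved):.2f}")
print(f"raw TB for 10 TB of data: {physical_needed(10, baseline):.1f}")
```

With these assumed values, raising written-to-assigned efficiency from 50% to 62.5% drops the multiple from about 6.25 to about 5.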
Gathering and reporting the efficiency information is also a challenge. The data is typically dispersed throughout the organization and no single tool currently provides it all. Developing a mechanism for collecting these metrics will likely require a cross-functional effort. It may not be easy, but it's essential to tame the storage consumption monster.