I have a message for all of the Tek-Tools, Computer Associates, AppIQ, EMC, Symantec, HP, IBM and all the rest of the 271 dabblers in the black arts of storage management: There are still a few simple data points that everyone seems to need, but no one seems to be getting from your products. Maybe this tip will help you to get your product development efforts pointed in a truly useful direction.
The other day, a client asked how he could get a handle on actual storage capacity growth in order to project storage requirements in his shop. He complained that he needed to go through so many gyrations just to get capacity allocation information out of his Network Appliance and EMC gear that he was seriously considering leaving storage altogether and joining one of the many troupes of the Cirque du Soleil that seem to be popping up at every hotel in Vegas.
"It seems like the vendors don't want us to know how much storage we are using. Or how much we have right now. Or how much they are wasting with all the value-add that they are giving me using my disks," lamented the would-be trapeze artiste. "In the meantime, I can't give management a straight answer to a simple question: How fast is storage growing?"
Responding to the fellow, I borrowed a page from a good friend of mine (and fellow industry troublemaker) Mike Linett, CEO of Zerowait, Inc. in sunny Newark, Del. I told the guy to "treat storage capacity like any other inventory." It's a simple idea that somehow sounded much more profound when Mike said it to me:
"Say that you have an inventory of widgets and you want to know how fast you are using them up. You simply take inventory at routine intervals, trend it over eight weeks or so, and -- voila! You have the storage usage info that you're looking for." It's easier than doing a triple back-flip dismount off the high wire.
Linett currently has his guys developing a simple tool for performing this feat with the NetApp gear used by his customers. He says he will shortly be offering it on his Web site. From where I'm sitting, we need similar capabilities for all of those other house-of-mirrors arrays whose vendors seem to be doing all they can to obfuscate efforts at capacity allocation measurement.
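To make the inventory idea concrete, here is a minimal sketch in Python. It is not Zerowait's tool, just an illustration of the arithmetic: take a used-capacity reading at regular intervals, fit a straight line through the readings, and read the growth rate off the slope. The sample figures, the 5,000 GB total and the function names are all made up for the example, and a straight-line fit is an assumption that will not suit every shop.

```python
from datetime import date, timedelta


def weekly_growth(samples):
    """Least-squares slope of used capacity (GB) against elapsed weeks.

    samples is a list of (date, used_gb) pairs, oldest first.
    """
    start = samples[0][0]
    xs = [(day - start).days / 7 for day, _ in samples]
    ys = [used for _, used in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


def weeks_until_full(samples, total_gb):
    """Weeks of headroom left at the current trend, or None if not growing."""
    slope = weekly_growth(samples)
    if slope <= 0:
        return None
    return (total_gb - samples[-1][1]) / slope


# Hypothetical inventory: eight weekly readings, growing 50 GB per week.
first_reading = date(2005, 1, 3)
inventory = [(first_reading + timedelta(weeks=i), 4000.0 + 50.0 * i)
             for i in range(8)]

print(f"Growth: {weekly_growth(inventory):.1f} GB per week")
print(f"Headroom: {weeks_until_full(inventory, 5000.0):.1f} weeks")
```

The hard part in practice is getting an honest used-capacity number out of the array in the first place, which is exactly the complaint above; the trending itself is trivial.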
For those who want more detailed information on what is actually using up all that expensive disk, Linett says he is adding capabilities to sort contents by file extension. You'll know how much of your capacity is being eaten up by Microsoft apps, how much by downloaded BitTorrents, and so forth…another good idea.
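The sort-by-extension idea can be sketched in a few lines of Python, assuming the volume is mounted somewhere a script can walk it. This is only an illustration, not the Zerowait feature: the function names are hypothetical, and it adds up logical file sizes, which will differ from the blocks an array actually allocates.

```python
import os
from collections import Counter


def usage_by_extension(root):
    """Total logical bytes under root, keyed by lower-cased file extension."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
            extension = os.path.splitext(name)[1].lower() or "(none)"
            totals[extension] += size
    return totals


def top_consumers(root, limit=10):
    """The largest extensions first, as (extension, bytes) pairs."""
    return usage_by_extension(root).most_common(limit)
```

Calling top_consumers on a share's mount point returns the biggest extensions first, which is usually enough to start an awkward conversation about what is sitting on production disk.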
I think there are a few more tweaks that the Zerowait folks could add as well. For one, I'd like to know how much storage each employee is using, or maybe all of the employees in a given department, or all the departments of a given business unit. That way, you can see who your storage hogs are.
I also think it would be great if someone could find a way to relate actual active storage growth to real inactive storage growth. People are always saying that you need eight to ten times as much inactive (backup, disaster recovery, data protection) capacity as you do active (production storage) capacity. Is that number real or just a bit of sleight-of-hand from the tape-and-mirroring crowd? I want to know!
Heck, I want even more: I want Mike to map his storage capacity statistics to some cost-of-ownership information. Take the value of the storage asset (usually a capital investment depreciated over time) and divide that by capacity to come up with cost per gigabyte (GB) to store data on that boat anchor of a Symmetrix or TagmaStore. Don't trust those funny numbers from vendors; do the math yourself.
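Doing the math yourself might look something like the following Python sketch. It assumes straight-line depreciation, which is only one of the methods your accountants might use, and every figure in it (purchase price, service life, usable capacity) is invented for illustration.

```python
def book_value(purchase_cost, salvage_value, life_years, age_years):
    """Straight-line book value of an asset after age_years in service."""
    age = min(max(age_years, 0.0), life_years)
    return purchase_cost - (purchase_cost - salvage_value) * age / life_years


def cost_per_gb(asset_value, usable_gb):
    """Dollars per usable gigabyte for an asset of the given value."""
    if usable_gb <= 0:
        raise ValueError("usable_gb must be positive")
    return asset_value / usable_gb


# Hypothetical array: $500,000 purchase, no salvage value, five-year life,
# two years old, 10,000 GB usable after RAID and vendor overhead.
value_today = book_value(500_000.0, 0.0, 5.0, 2.0)
print(f"Book value: ${value_today:,.0f}")
print(f"Cost per GB: ${cost_per_gb(value_today, 10_000.0):.2f}")
```

Note that the divisor is usable capacity, not the raw number on the invoice. Divide by raw capacity and you get the vendor's funny number instead of your own.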
Truth be told, many vendors say that they are working on interfacing their storage capacity reporting tools to back-end asset management systems. That will be a great thing once they get around to it. But I'm sick of waiting for all the relationship building between storage guys and asset management software guys to deliver results. We can get there today, especially if Mike were to put in some data entry screens where consumers can enter in the cost data from their hardware invoices, and tick off the appropriate depreciation method. Even Excel can calculate the result.
Oh, and be sure to add in the cost of all of that legacy Fibre Channel (FC) infrastructure (HBAs, GBICs, switches, cabling, etc.) that customers are talked into deploying just to make their lives a living hell. Watch how $50/GB FC disk accelerates quickly to $180/GB FC fabric storage! How much is it costing consumers to put up their data at the Hotel Fibre Channel? Probably a lot more than they think!
I'd also like Mike to supply some fill-in-the-blanks spots for adding in soft costs: array software licenses, environmental costs (power, etc.), management and tech support costs (just multiply rough percentages of admin time by administrator salary or hourly compensation), and maybe even some pain-and-suffering expense that all those typical TCO models tend to overlook.
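Those fill-in-the-blanks spots amount to a short list of inputs and one sum. A rough Python sketch follows, with hypothetical field names and invented figures; it puts everything on a yearly basis and leaves out pain and suffering, which I have not figured out how to depreciate.

```python
from dataclasses import dataclass


@dataclass
class AnnualCosts:
    """Hypothetical yearly cost inputs, in dollars unless noted."""
    array_depreciation: float    # yearly depreciation on the array itself
    fabric_depreciation: float   # HBAs, GBICs, switches, cabling
    software_licenses: float
    power_and_cooling: float
    admin_salary: float          # annual administrator compensation
    admin_time_fraction: float   # rough share of admin time, 0.0 to 1.0

    def hard_costs(self):
        return self.array_depreciation + self.fabric_depreciation

    def soft_costs(self):
        return (self.software_licenses + self.power_and_cooling
                + self.admin_salary * self.admin_time_fraction)

    def total(self):
        return self.hard_costs() + self.soft_costs()


def annual_cost_per_gb(amount, usable_gb):
    """Yearly dollars per usable gigabyte."""
    if usable_gb <= 0:
        raise ValueError("usable_gb must be positive")
    return amount / usable_gb


# Invented figures for illustration only.
costs = AnnualCosts(
    array_depreciation=100_000.0,
    fabric_depreciation=40_000.0,
    software_licenses=30_000.0,
    power_and_cooling=10_000.0,
    admin_salary=80_000.0,
    admin_time_fraction=0.25,
)
USABLE_GB = 10_000.0

disk_only = annual_cost_per_gb(costs.array_depreciation, USABLE_GB)
fully_loaded = annual_cost_per_gb(costs.total(), USABLE_GB)
print(f"Disk only: ${disk_only:.2f}/GB per year")
print(f"Fully loaded: ${fully_loaded:.2f}/GB per year")
```

With these made-up inputs the fully loaded figure comes out at double the disk-only figure. Your own invoices will produce a different multiple, but the gap between the two numbers is the point.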
By the way, the result will be an all-important component of any worthwhile Information Lifecycle Management (ILM) scheme. To write good ILM policies, you need to map data to appropriate infrastructure from a performance and cost perspective. Most of the self-styled ILM vendors aren't giving us any tools for characterizing storage platforms, probably because most of them also manufacture storage platforms.
This is all pretty basic stuff that has been, for some reason, overlooked in the management tools that are out there right now. I haven't even asked anyone to solve the problem of data naming, so we can figure out what to move in our ILM scheme, or to come up with a working access frequency counter, so you could know when to move data from target to target over time. For your information, you might be able to do that by taking the proctologist's view: Look at your backup logs and do some reverse engineering to see what data is being changed and how frequently.
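As a rough illustration of the backup-log trick, here is a Python sketch. It assumes you have already pulled, from each incremental backup run, the list of paths that run picked up; how you extract that depends entirely on your backup product, and nothing here parses any real log format. The function names and sample paths are hypothetical. Bear in mind that this measures how often data changes, not how often it is read.

```python
from collections import Counter


def change_frequency(backup_runs):
    """Fraction of incremental runs in which each path showed up as changed.

    backup_runs is a list with one entry per run; each entry is the list
    of paths that run backed up.
    """
    counts = Counter()
    for changed_paths in backup_runs:
        counts.update(set(changed_paths))  # count a path once per run
    run_count = len(backup_runs)
    return {path: hits / run_count for path, hits in counts.items()}


def migration_candidates(backup_runs, all_paths, threshold=0.0):
    """Paths whose change frequency is at or below the threshold."""
    frequency = change_frequency(backup_runs)
    return sorted(path for path in all_paths
                  if frequency.get(path, 0.0) <= threshold)


# Hypothetical extract from four nightly incrementals.
runs = [
    ["/finance/ledger.xls", "/hr/policy.doc"],
    ["/finance/ledger.xls"],
    ["/finance/ledger.xls", "/eng/specs.pdf"],
    ["/finance/ledger.xls"],
]
known_paths = ["/finance/ledger.xls", "/hr/policy.doc",
               "/eng/specs.pdf", "/archive/2001.zip"]

print(change_frequency(runs))
print(migration_candidates(runs, known_paths))
```

Anything that never appears in the incrementals over a long enough window is a reasonable first candidate for cheaper disk.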
About the author: Jon William Toigo is a managing partner for Toigo Productions. Jon has over 20 years of experience in IT and storage.