For many organizations, the practical challenge of data protection comes down to surmounting management reluctance to spend money on a capability that, in the best of circumstances, never needs to be used. This is less the case in industries such as finance and banking, where organizations are compelled by both business requirements and federal regulations to develop and maintain a recovery capability.
However, even in the nation's largest banks, pressure is on storage managers to drive costs out of data protection wherever possible. In at least one case involving a major banking institution, this requirement has led to an interesting approach -- sort of a "management by TCO comparison" -- that may be worth emulating in other cases. The name of the institution where Phil, a storage manager, is employed has been intentionally withheld from this column in consideration of corporate sensitivities.
According to Phil (who recently shared his story with me), a decision was made on the heels of a significant merger in 2000 to standardize "on one disk vendor, one tape vendor, one FC fabric vendor and one backup vendor" to drive costs out of the storage and data protection infrastructure.
"It was believed," Phil recalls, "that the operational efficiencies gained by technological homogeneity and through bulk license acquisitions would yield significant financial benefits over the long-term. These core technologies formed the foundation of [our conceptualization of a] Storage Utility. It was also decided that, as operational risks diminished, additional disk platforms and, if necessary, additional disk vendors, would be added to the Utility. But for as long as it is deemed appropriate, 'keeping it simple' would be our primary architectural guideline."
This philosophy led the company "into the Fibre Channel fabric business in a big way and in a big hurry" in 2001, says the manager. Encouraged by business case analysis, which suggested that sharing tape drives and disk subsystems would provide significant financial benefits over the traditional direct-attached model, he sought ways to measure the success of various techniques slated to be rolled out over time to minimize cost while increasing storage and data protection efficiency.
"At the time ," he recalls, "every server was deployed with at least one dedicated tape drive and many servers had multiple tape drives. This configuration model yielded an average tape drive-to-server ratio of 1-to-.57. Today, in the shared Tape Utility, we have an average of 1-to-1.3. In other words, our tape drive efficiency improved 228% as a result of employing tape library / tape drive sharing in a FC fabric. Over the course of the past two years, 23 Storage Utility locations have been plumbed with a FC storage network and as new servers are deployed, they are plugged into the fabric for both storage and data protection services."
He quickly adds that his job did not stop with the specification and deployment of a new data protection infrastructure: "[We understood that] mismanagement of these new assets [over time] would have more than an inconsequential bottom-line impact." This observation led him to implement an annual Total Cost of Ownership study that enabled him to track any progress that his strategy was making toward reducing costs.
Phil secured a TCO modeling tool from a leading research & analysis firm and took a full six months to derive baseline utilization statistics about his infrastructure. "One of the reasons it took so long," he explains, "is that there were numerous asset tracking systems from the past several mergers that were not yet integrated. Trying to normalize lease, depreciation and maintenance schedules for thousands of assets is a task no sane person would relish, but it had to be done in order to provide a common costing formulation for all acquisitions for all banks for the previous five years. One gags on the minutiae. One suffocates within a mephitic swarm of paperwork. One tears at one's own skin at the thought of having to review yet another Excel formula with 63 referential variables. I assure you, Dante missed a Circle on his little road trip with Virgil."
The factors measured and included in his TCO analysis were:
"1. All personnel costs (salary, benefits, overhead) for any associate engaged in any storage management activity -- even if only 1 hour per week -- were captured. The total hours of storage management activity were then multiplied by the relevant personnel costs to determine total personnel costs and total full-time equivalent employees (which, in our case, was 3 times the number of associates who were actually employed by my storage management organization). For instance, a DBA, tape operator, a supply chain manager or an electrician are involved in some aspect of storage management, from procurement to installation to production support, but none of these associates are employed by my organization."
"2. Floorspace consumed by disk, tape and I/O infrastructure assets was actually measured in the data centers. Floorspace consumed by servers with internal disk was calculated as one-ninth the floorspace consumed by one rack. Floorspace costs for each data center were individually calculated based on the charge-back model for that geographic area."
"3. Power and cooling costs for disk, tape and I/O infrastructure assets were calculated based on the vendor's specs. Power and cooling costs for internal storage was assumed to be 20% of the power consumed by the specific host. Power and cooling costs for each data center were individually calculated based on the chargeback model for that geographic area." "4. For those servers with internal disk and tape, representative invoices of standard configurations were used to develop a costing assumption for storage. This cost would vary from company to company, depending on their server builds. The goal is to establish a standard cost assumption for internal storage for server class X as a percentage of the total server cost and then multiply this cost assumption by the number of servers deployed in that class.
For instance, a company may find after studying their invoices that their standard Windows server cost is $7,000 and that 40% of this cost is associated with internal disk and tape drives. In this case, $2,800 is multiplied by the number of servers in this class and then divided by the lease schedule to yield an annual cost. Maintenance cost assumptions are handled in the same manner using the same proportions."
"5. For disk, tape and I/O infrastructure assets external to servers, the actual lease/depreciation/maintenance schedules were used. This was an incredibly tedious and time-consuming process, but upon completion, it lent a great deal of credibility to the final numbers."
"6. Any storage management software lease/depreciation/maintenance costs are counted."
"7. Any cost associated with personnel support is counted: offices, supplies, telecommunications, training, etc."
"8. Tape media costs are included in the TCO."
"9. Consulting fees, off-site storage costs, business continuity costs are included."
"10. No desktops were counted ... only mainframe, midrange and distributed servers."
Phil adds, "There really isn't anything magical about the formulation. If a cost is related to storage, add it in. The success of the TCO study is directly proportional to the commitment of senior management to determine the real TCO. It's a lot of hard work for everyone involved and the project manager has to be relentless in pursuing accurate and complete information."
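The allocation rules Phil describes for server-internal storage (items 2 through 4 above) reduce to simple arithmetic. The Python sketch below is one way to express them, not Phil's actual model: the 40%, 20% and one-ninth factors and the $7,000 server cost come from his account, while the function names, the server count, the lease term in years, and the host power and rack floorspace costs are hypothetical inputs chosen only for illustration. It also assumes that "divided by the lease schedule" means dividing by the lease term in years.

```python
# Sketch of the server-internal storage allocations from items 2-4.
# From the column: 40% storage share of a $7,000 server, 20% of host
# power, one-ninth of a rack per server. Everything else is hypothetical.

def annual_internal_storage_cost(server_cost, storage_share, server_count, lease_years):
    """Annualized hardware cost attributed to internal disk and tape."""
    return server_cost * storage_share * server_count / lease_years


def internal_storage_power_cost(host_power_cost, share=0.20):
    """Power and cooling cost attributed to a host's internal storage."""
    return host_power_cost * share


def internal_storage_floorspace_cost(rack_floorspace_cost, servers):
    """Floorspace cost for servers with internal disk, at one-ninth of a rack each."""
    return rack_floorspace_cost / 9 * servers


# Hypothetical inputs: 300 servers in the class on a 3-year lease,
# $500 annual power cost per host, $1,800 annual floorspace cost per rack.
hardware = annual_internal_storage_cost(7_000, 0.40, server_count=300, lease_years=3)
power = internal_storage_power_cost(host_power_cost=500.0)
floorspace = internal_storage_floorspace_cost(rack_floorspace_cost=1_800.0, servers=9)

print(f"Annual internal storage hardware cost: ${hardware:,.0f}")
print(f"Power and cooling per host for internal storage: ${power:,.0f}")
print(f"Floorspace for nine servers: ${floorspace:,.0f}")
```

Maintenance costs could be run through the first function in the same way, since Phil notes that they use the same proportions.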
Armed with baseline data, Phil says he has been able to compare results year-after-year to demonstrate the efficacy of the parts of his data protection and storage strategies that unfolded over the previous 12 months. In his words, "Once the initial study establishes the TCO baseline, it's very simple to demonstrate quantitative progress with year-over-year comparisons. Our senior management has challenged us with an expectation of a 30% annual decline in the TCO per utilized MB for the mainframe, UNIX and Windows platforms. This objective of course has to be balanced against customer satisfaction and other non-financial metrics like availability. It doesn't do anyone much good to have the world's lowest Storage TCO if none of one's customers can find their data. But having a trusted, systematic methodology for demonstrating prudent stewardship of corporate assets should leave more time for focusing on the more qualitative facets of the discipline."
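The year-over-year comparison can be illustrated with a small Python sketch. Only the 30% annual decline target comes from Phil's account; the function names and the baseline and measured values are hypothetical, with TCO per utilized MB expressed in arbitrary units.

```python
# Sketch: target trajectory and year-over-year check for TCO per utilized MB.
# The 30% target is from the column; all other values are hypothetical.

def target_tco_per_mb(baseline, years, annual_decline=0.30):
    """Targets for year 0..years if TCO per MB falls by annual_decline each year."""
    return [baseline * (1 - annual_decline) ** n for n in range(years + 1)]


def yoy_decline(previous, current):
    """Fractional decline from one year's TCO per MB to the next."""
    return (previous - current) / previous


def meets_target(previous, current, annual_decline=0.30):
    """True if the measured decline is at least the target decline."""
    return current <= previous * (1 - annual_decline)


targets = target_tco_per_mb(baseline=1.0, years=3)
print([round(t, 3) for t in targets])

# Hypothetical measurements for two consecutive annual studies.
print(f"Measured decline: {yoy_decline(1.0, 0.65):.0%}")
print("Target met:", meets_target(1.0, 0.65))
```

Because the target compounds, three years at 30% would leave TCO per utilized MB at roughly a third of the baseline, which shows how demanding the expectation is.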
Phil stands by the value of TCO comparisons as a metric for demonstrating the efficacy of data protection and storage architecture strategies. He says that the TCO study is not only a report card on management effectiveness, but also "a treasure map suggesting places to look for new expense-lowering strategies. Insights into opportunities that would otherwise go unnoticed can be found in the arcane details of these results. For large organizations, one of the most productive tasks to undertake would be to complete an annual Storage TCO study. I heartily recommend it … especially to those I dislike."
Thanks to Phil for this insight. If you have a method or technique you would like to discuss, please contact this columnist at firstname.lastname@example.org.
About the Author: Jon William Toigo has authored hundreds of articles on storage and technology along with his monthly SearchStorage.com "Toigo's Take on Storage" expert column and backup/recovery feature. He is also a frequent site contributor on the subjects of storage management, disaster recovery and enterprise storage. Toigo has authored a number of storage books, including Disaster recovery planning: Preparing for the unthinkable, 3/e. For detailed information on the nine parts of a full-fledged DR plan, see Jon's web site at www.drplanning.org/phases.html.