Mike Taylor's reaction to the idea of storage service level agreements (SLAs) is typical: They haven't hit his high-priority list yet, but he suspects that will change fairly quickly. "We really don't have formalized service levels for storage at this point," says Taylor, a storage administrator at Capital Blue Cross in Harrisburg, PA. "But I do see them coming."
Storage hasn't historically rated separate SLAs, with the huge exception of disaster recovery. But Sept. 11 kicked disaster recovery planning into high gear, and senior executives started paying attention. Moreover, storage technology has steadily grown more complex in the past several years, prompting IT departments to consider the value of creating storage-specific SLAs.
Then there's the influence of storage service providers that use SLAs as part of the contracts they sign with clients. "The use of storage SLAs really accelerated in the late '90s as SSPs started selling," says Al Sporer, the global practice manager for operations management consulting at EMC. "We've seen the concept start to move to internal IT departments in the past couple of years."
What rates an SLA?
Most importantly, CIOs and storage managers view storage SLAs as a tool to rein in storage spending. As the economic downturn drags on in many industries, CIOs need to cut costs. Storage--which many experts say can take up to 50% of an IT department's capital budget--has long fallen into the murky category of "necessary, but hard to track financially."
Small wonder, then, that storage expenditures have come under increased scrutiny. "Executives are beginning to ask why the bean counters can't account better for storage spending," says Dan Tanner, an analyst with the storage research group at the Aberdeen Group, Boston, MA.
SLAs aren't wholly about service--they're also about money. "If you apply cost to a policy, you've got the beginnings of an SLA," says Mark Friedman, the vice president of storage technology at DataCore Software in Ft. Lauderdale, FL.
Although corporate interest is being piqued by the notion of storage SLAs, issues clog the road to smooth implementation. It's all very well to sign agreements specifying that applications will be backed up nightly, or that disk space will be allocated within 48 hours of a request, but what's the point if IS can't fulfill the requirements of the SLA? For example, Jerome Wendt, a senior storage analyst at First Data Resources in Omaha, NE, envisions storage as an in-house utility. Regardless of what SAN your server is plugged into, "you should be able to call us and get storage," he says. But he can't currently guarantee this is going to happen, so he won't make promises.
For storage SLAs to have teeth, they must work properly. The first step is to find out what the main challenges are in implementing them. The next is to know what to measure, and which services make little sense to measure.
At many companies, "If there are SLAs in place, they are so generic that they're useless," says Steve Duplessie, founder and senior analyst at Enterprise Storage Group, a storage research company based in Milford, MA. Tanner agrees. "Companies using any sort of chargeback are using a more or less gross measure, maybe megabytes or gigabytes," he says. The end result is an SLA with no bite. He says, "If the backup SLA is, 'OK, I'll back stuff up,' what does that really mean? Users want to know that at x period of time--no matter what happens--they can restore applications within a certain level of time."
Making SLAs more granular makes it simpler to tie specific costs to each service level. Storage managers can then give users better information about storage costs, and users can make more informed decisions. Duplessie gives a hypothetical example: "Let's say a department has 3TBs of data sitting on an IBM Shark. But in six months, that data hasn't been accessed. Does it really belong on the most expensive level of storage? The SLA should be written at a level of granularity to address this." For example, "If data isn't accessed within x months, IS can move it off the most expensive storage and put it on less expensive stuff."
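Duplessie's rule is simple enough to sketch in a few lines of Python. What follows is only an illustration of the idea, not any vendor's tool: the tier names, the per-gigabyte prices (in cents per month), the six-month default and the function names are all assumptions made up for the example, and 3TB is treated as 3,000GB.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical monthly cost per GB, in cents. Illustrative numbers only.
TIER_COST_CENTS_PER_GB = {"premium": 50, "midrange": 20, "archive": 5}


@dataclass
class Dataset:
    name: str
    size_gb: int
    tier: str
    last_accessed: date


def months_idle(ds: Dataset, today: date) -> int:
    """Whole calendar months since the dataset was last accessed."""
    months = (today.year - ds.last_accessed.year) * 12
    months += today.month - ds.last_accessed.month
    if today.day < ds.last_accessed.day:
        months -= 1  # the current month has not fully elapsed yet
    return max(months, 0)


def recommended_tier(ds: Dataset, today: date,
                     idle_limit_months: int = 6,
                     cheaper_tier: str = "midrange") -> str:
    """Apply the policy: premium data idle for the limit or longer moves down."""
    if ds.tier == "premium" and months_idle(ds, today) >= idle_limit_months:
        return cheaper_tier
    return ds.tier


def monthly_cost_cents(size_gb: int, tier: str) -> int:
    """Chargeback for one month at the assumed per-GB rate."""
    return size_gb * TIER_COST_CENTS_PER_GB[tier]


if __name__ == "__main__":
    ds = Dataset("dept-data", 3000, "premium", date(2002, 1, 15))
    today = date(2002, 8, 1)
    target = recommended_tier(ds, today)
    saving = monthly_cost_cents(ds.size_gb, ds.tier) - monthly_cost_cents(ds.size_gb, target)
    print(f"{ds.name}: {ds.tier} -> {target}, saves {saving / 100:.2f} per month")
```

Because each tier carries a price, the same rule that moves the data also tells the department what the move is worth, which is the kind of information a granular SLA is meant to put in front of users.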
At MasterCard International in New York City, for example, IT operates on a chargeback basis. Jerry McElhatton, the president of global technology and operations, says that sticking with just a few storage vendors lowers costs, which lets him offer better service level agreements. "We provide a service and have service level agreements," he says. "And by having volume discount on storage purchasing we were able to eliminate multiple data numbers." That, in turn, made it easier for the group to stay within acceptable windows of service.
Who cracks the whip?
It's an interesting conundrum: The people who measure SLA compliance are the same people being held accountable for it. Most experts say, however, that the system works if both the user side and the storage side are working towards the common goal of driving down costs. For example, if storage services operate on a chargeback basis, storage managers are probably working towards specific cost-cutting goals. It's therefore in their best interest to meet the SLAs, says DataCore's Friedman. "You enforce SLAs best at the level of the cash register," he says.
Jean Banco, a product manager at Fujitsu Softek in Sunnyvale, CA, says the buck stops with the application manager. Here's why: "The application manager generally is the one to set service levels for the storage manager on the automated tools," she says. "They set up the storage requirements, the performance requirements, recoverability, security--the business-centric side of things." It's then the storage manager's responsibility to map those requirements to the physical storage environment, and the application manager is the one who tracks compliance.
SLAs that measure the wrong things
Router availability, storage availability and backup: All of these things are commonly found in SLAs, but storage managers who use them as the basis for SLAs are using the wrong yardstick, says Raymond Paquet, an analyst at Gartner Group. "At the end of the day, an SLA is composed of three words, and 90% of IS groups forget the first one," he says. "Most people define a metric and think that's the definition of the service. That's wrong. The first part is defining the service."
The bottom line is storage services should be things that are meaningful and impact the business. For example, "Storage availability isn't a service," he says. "As a business user, I don't care if storage is available or not. I want the application to be up."
Paquet advises IS groups to continue to measure and monitor such storage benchmarks, but only as internal measures. "I'd use them to hold the storage management group accountable within IS," he says. "The availability of underlying components are interesting to IT, but not to the business." So what would Paquet build storage SLAs around? The things that business people will come directly in contact with, he says: recovery and provisioning.
Richard Scannell, the VP of corporate development and strategy at GlassHouse Technologies, Inc., Framingham, MA, says SLAs are frequently unenforceable because storage managers don't implement policies and procedures spelling out how to accomplish the SLA. "No technology by itself can produce SLA compliance. It's how it's used by people, and a big problem is not defining processes and procedures that support SLA compliance," he says.
The problem, according to Scannell, is that it's difficult to define processes--"to say, here are the things to measure, and here's how to translate them into capital, expense and human requirements. As a result, the SLA is the equivalent of giving the CIO enough rope to hang himself with."
Mapping an SLA to reality
Every line-of-business manager will be the first to tell you their application is absolutely mission critical, but the truth is there will always be some systems that are more vital than others. Many application SLAs don't take that into account, and instead assign service levels that are overly high for a given application.
The trick is to find a way to get users to agree to lesser levels of service. Like Wendt, Scannell suggests making the conversation about money. "Pass some of the responsibility back to the customer," he says.
Once an SLA is agreed upon, the fun begins: measuring compliance. There are plenty of storage resource management (SRM) tools on the market that do a good job of collecting data from individual storage devices. However, Gartner's Paquet says, "There's no way to aggregate the data," adding, "Without a common data structure, how do you correlate the data?"
Wendt agrees, saying the heterogeneous technology environment of open systems storage makes it almost impossible to find a tool that will measure service level adherence across the enterprise. Wendt envisions software running a management console that lets him build customized data collection metrics on various aspects of the storage infrastructure. "I could have a script running in the background that would tell me when I was about to go out of the window of acceptance on a service level, but right now I don't have any automated way of collecting this data," he says. "A lot of measurement is still seat-of-the-pants, and that's no way to measure an SLA."
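The kind of background check Wendt describes can be sketched briefly in Python, assuming the measurements have already been collected somewhere, which is exactly the part he says is missing today. The example borrows the 48-hour disk allocation target mentioned earlier; the `ServiceLevel` class, the function names and the 80% warning threshold are hypothetical choices for illustration, not features of any SRM product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ServiceLevel:
    name: str
    limit: float                # worst acceptable value, e.g. hours to allocate disk
    warn_fraction: float = 0.8  # assumed: warn once 80% of the limit is used up


def check(level: ServiceLevel, measured: float) -> str:
    """Classify one measurement against the window of acceptance."""
    if measured > level.limit:
        return "breach"
    if measured >= level.limit * level.warn_fraction:
        return "warning"
    return "ok"


def at_risk(level: ServiceLevel, measurements: List[float]) -> List[Tuple[float, str]]:
    """Return only the measurements that are close to or beyond the limit."""
    flagged = []
    for value in measurements:
        status = check(level, value)
        if status != "ok":
            flagged.append((value, status))
    return flagged


if __name__ == "__main__":
    provisioning = ServiceLevel("disk allocation (hours)", limit=48)
    for value, status in at_risk(provisioning, [12, 40, 50]):
        print(f"{provisioning.name}: {value} -> {status}")
```

The check itself is trivial. The hard part, as Wendt and Paquet both point out, is feeding it consistent numbers from a heterogeneous storage environment.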
Aberdeen's Tanner says, "Tools managing things like storage, network resources, file services, database services are extraordinarily complicated, and there's no tool that does it all. The more sophisticated they are, the fewer vendors they work with." He says companies on the leading edge of storage SLAs have built their own tools.
Some of the better tools are coming from SSPs, many of which are struggling financially. According to Tanner, "SSPs have discovered that companies don't want to outsource storage, but they all have developed software that will let companies manage storage internally like an SSP." This includes being able to measure SLA adherence.
Sooner or later, most companies will move to storage SLAs. First Data's Wendt says, "SLAs are definitely coming, and I need to start thinking about this."