
Storage Decisions 2001 Preview: Achieving perfection in uptime

Achieving "five 9s" is the ultimate goal in terms of availability. By definition, five 9s means a system is running 99.999% of the time. Or, if you're a "the cup is half empty" kind of person, your system is down for about 5.26 minutes -- total -- a year. That doesn't seem like much, but for shops running mission-critical apps, the thought of any downtime is enough to make them blanch. Still, there's a lot of misconception surrounding five 9s, even among the most sophisticated and experienced IT pros. It's a tough goal to achieve, let alone figure out, but Marc Staimer, president and self-dubbed "Chief Dragon Slayer" of Beaverton, Ore.-based Dragon Slayer Consulting, has practical tips for those ready to take on this fire-breathing problem. Staimer, who'll be sharing his insights on high availability during a presentation at searchStorage's Storage Decisions 2001 conference next month, talks with Site Editor Michele Hope and reveals how users can achieve high availability without getting burned.
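The downtime budget behind an availability figure is simple arithmetic. The following is a minimal sketch, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative and not from the interview.

```python
# Minutes in a year, assuming a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three, four and five 9s side by side.
    for level in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% available -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, each additional 9 cuts the yearly downtime budget by a factor of ten.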

Can you give an example?
Sure. Servers themselves are one of the most expensive pieces on which to engineer five 9s. With IP (Internet Protocol), the network is relatively easy: you dual-connect to separate routers and separate switches so that you always have access. But there's a different connotation to five 9s when you talk about networks. When it comes to network performance, does five 9s mean the same network performance at all times, or do you allow for reduced performance? The answer affects how you engineer as well. These are questions most people don't ask themselves.
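The distinction between staying reachable and staying at full performance can be made concrete with a small model. This is only a sketch, assuming redundant paths that fail independently; the 0.999 per-path availability, the capacity fractions and both function names are hypothetical values chosen for illustration.

```python
def redundant_availability(path_availability: float, paths: int = 2) -> float:
    """Probability that at least one of several independent, identical paths is up."""
    return 1 - (1 - path_availability) ** paths


def bandwidth_after_failure(per_path_capacity: float, paths: int, failed: int) -> float:
    """Total capacity left (as a fraction of required load) once some paths are down."""
    return per_path_capacity * max(paths - failed, 0)


if __name__ == "__main__":
    single = 0.999  # hypothetical availability of one router/switch path
    dual = redundant_availability(single, paths=2)
    print(f"one path: {single:.6f}, dual-connected: {dual:.6f}")

    # Each path sized for half the load: still reachable after a failure, but degraded.
    print("half-sized paths, one failed:", bandwidth_after_failure(0.5, paths=2, failed=1))
    # Each path sized for the full load: no degradation, at roughly double the cost.
    print("full-sized paths, one failed:", bandwidth_after_failure(1.0, paths=2, failed=1))
```

In this model, dual-connecting buys reachability either way. Whether performance holds after a failure depends on how much capacity each path was sized to carry, which is where the extra cost comes in.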

On the back-end network, with the SAN, it is the same thing. You could set up your storage access so that you have 100% availability and 100% bandwidth at all times. Ask yourself the question: Do you need that for everything?

If you are trying to tackle achieving five 9s on any one of those four subsystems you mentioned earlier, how would you go about it?
There are different criteria for each of those subsystems: storage, storage area networks, servers, or networks. The fact is, some of these are relatively easy to engineer for five 9s, but some of them are also very expensive to engineer.

How realistic and achievable is five 9s of availability for today's IT organization?
It is. You can structure your environment to meet it. But there are three aspects to your environment that people don't think about (actually four) that you really want to take into account: server and application environments, storage networks, storage itself and the network.

Each one is a separate set of systems. If you just build five 9s for one, it doesn't mean you've built five 9s for all of them. Most people focus on achieving five 9s on the server. That's one piece -- but if the storage is not there, you're still down. Now, if you build all four areas as five 9s separately, that doesn't mean they are all going to work as five 9s synergistically.
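The point that four separately engineered subsystems do not add up to five 9s overall can be illustrated with a simple series model. The sketch below assumes the subsystems fail independently and that the service is up only when all four are up; the 0.99999 figures and the function name `chained_availability` are illustrative, not taken from the interview.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def chained_availability(availabilities) -> float:
    """Availability of a chain in which every component must be up (independent failures)."""
    return prod(availabilities)


# Hypothetical case: each of the four subsystems individually meets five 9s.
subsystems = {
    "servers": 0.99999,
    "network": 0.99999,
    "storage network (SAN)": 0.99999,
    "storage": 0.99999,
}

end_to_end = chained_availability(subsystems.values())
downtime = MINUTES_PER_YEAR * (1 - end_to_end)

if __name__ == "__main__":
    print(f"end-to-end availability: {end_to_end:.6%}")
    print(f"end-to-end downtime: {downtime:.1f} minutes per year")
```

Under this model, four five 9s subsystems in series deliver roughly 99.996% end to end, or about 21 minutes of downtime a year, before accounting for any failures caused by how the pieces interact.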

What you really need to look at when you start engineering or designing your systems (or redesigning existing systems) is where you most need to achieve five 9s. Not everything needs to be at five 9s. That's where available budgets come in. Typically, people focus on this level of availability for certain systems only: 'Hey, I need my e-mail all the time,' or 'I need my database up,' or 'I need my e-commerce application up and running at all times.' That's fine. You have to select which systems get five 9s -- rather than do it all -- because it's going to cost you too much to make everything five 9s.

What do you hear from users about their goals of a high availability IT infrastructure?
It depends on the company. You talk to a bank, and they are very into the classic, mainframe definition of availability: always up, always available, never degraded. That's their view of five 9s. This is the traditional viewpoint from the IT data center. They can't afford to have downtime in their systems. When you talk to airline reservations companies, they say the same thing. Now ask them these questions: 'Which of your systems can you afford to be degraded? Never mind being down, what about degrading? How does that affect your calculation?' There's a rule in any engineering discipline, regardless of whether it's civil engineering or high-tech engineering: The last 10% will cost you 90% of your dollars. So, now you have to ask yourself, 'What is good enough for different things?' It may seem easy to say, 'If I just make sure I have 100% availability all the time, I won't have to worry.' That's true, if cost isn't an issue. But we're in an economic downturn where cost is an issue.

Are there any good materials you've come across that provide a framework for how to approach the quest for five 9s?
I've checked around in my own research. There isn't very much out there. You see a lot of information on high availability on the Web, but it tends to be more general. It doesn't take a complete view of the IT infrastructure. Most of the information out there on the subject tends to be very vendor-specific and vendor-influenced. Let's take the server world. You've got the System 390 (now z900) view of five 9s. You've got the server view of five 9s. You get the database view of five 9s. You have the IT network view of five 9s. You have the storage area network (SAN) view of five 9s. You have the storage view of five 9s. By the way, they're all different.

Should we start calling you Saint Storage?
Yes, that's right! As I explain to my Asian colleagues, though, it's the European, medieval dragon we are trying to slay -- not an Asian dragon. Asian dragons are good luck. This is the nasty kind of dragon that represents tough problems.

You have a job ahead of yourself, setting the proper framework up in which people should be viewing the quest for five 9s in their organization.
Right. They have to get to the point where they don't depend solely on a vendor. They need to depend on themselves in this effort. They need to know what they've got. They need to bite the bullet, and it's no fun organizing. There are no shortcuts to knowing which data is important.

Basically, you're asking the question, 'What can we live with? What level of downtime can we live with in this system?' That's the question you MUST ask. You must understand the scenarios you can't live with, the ones where you would be out of business.

Where do you begin? How do you get started?
Believe it or not, the storage part of the world is the tail that's wagging the dog. Think of it this way: When all your data resides on the storage, I don't care how big your server is. If you can't access your storage, you're down.

How do you break the quest for five 9s down into manageable chunks so that it doesn't seem like such an overwhelming, complicated task?
Basically, I would deal with the most critical components first -- from the back-end to the front-end: storage, SANs, servers, applications, networks. You need to look at which data MUST be available all the time and which data needs to be available all the time with full bandwidth.

Before we finish, I have to ask you for the story behind the name of your company, Dragon Slayer Consulting, and your own title of Chief Dragon Slayer.
I'm president and CDS (Chief Dragon Slayer). Being a CEO seemed a bit much for a one-person company. Instead of chief executive officer, it seemed better to call myself chief dragon slayer. It's basically about coming up with the right answers in my career. People who worked with me had often told me I had a knack for getting to the right answers. A couple of people named me the Dragon Slayer, a la Saint George.
