This article can also be found in the Premium Editorial Download "Storage magazine: Best storage products of the year 2002."
That's what we learned at Intuit Inc. when we decided to implement a SAN three years ago. With business requirements and storage doubling - sometimes tripling - every year, the advantages of achieving greater storage resource utilization through centralization, consolidation and availability were incentive enough to go ahead and be one of the early adopters of SAN technology. As our SANs grew from around 20TB, 128 ports and 60 DLT tape drives to approximately 200TB, 900 switch ports and 140 DLT drives, we encountered unforeseen problems that can plague you if you're not prepared.
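As a rough check on those figures, the sketch below computes the compound annual growth factor they imply. It assumes the growth took place over the three years mentioned above, which the numbers themselves don't state; the function names are illustrative only.

```python
def annual_growth_factor(start, end, years):
    """Compound yearly factor that turns `start` into `end` over `years`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must be positive")
    return (end / start) ** (1 / years)


def project(current, factor, years):
    """Project a quantity forward at a constant yearly growth factor."""
    return current * factor ** years


YEARS = 3  # assumption: growth measured over the three years since adoption

storage = annual_growth_factor(20, 200, YEARS)   # TB
ports = annual_growth_factor(128, 900, YEARS)    # switch ports
tape = annual_growth_factor(60, 140, YEARS)      # DLT drives

for name, factor in [("storage", storage), ("ports", ports), ("tape", tape)]:
    print(f"{name}: x{factor:.2f} per year")
```

Under that assumption, storage grew a little more than twofold per year and switch ports slightly less than twofold, while tape drives grew far more slowly. The same `project` helper can be used to estimate what a design will have to absorb a few years out.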
One of the challenges was sharing SAN resources and achieving 100% utilization while trying to avoid both high costs and a large team to manage the SAN. We also had to figure out how to protect our initial investment while expanding - you don't want to have to throw out the infrastructure you built when you were relatively small in order to expand.
You can avoid these landmines by not boxing yourself in with a SAN design that can't scale effectively. Understanding what that means concretely, however, is far from obvious.
Avoid false economies
Our SAN implementation began as a few isolated monolithic and modular storage arrays with redundant fabrics made up of a few switches. The monolithic arrays served mostly Unix servers running a mix of operating systems, while the modular arrays served Intel servers. The modular arrays were less expensive, but at the time they lacked the availability, caching and support for multiple mirror copies that the monolithic arrays offered, so they were mostly used for smaller applications such as databases running on Intel platforms. Although this changed over time as the software for the modular arrays became competitive with the features of the monolithic arrays, we continued to use the monolithic arrays for the most critical applications.
Ultimately, decisions at a higher level forced us to move to a method for replicating data, which meant moving more apps to monolithic storage and scrapping some of the modular systems. Always try to anticipate your future needs when you choose your primary storage (see "Scaling backup").
Some of our initial SAN implementations were performed by adding Fibre Channel host bus adapters (HBAs) to the servers and then migrating the direct-attached servers from maxed-out arrays to new ones, with Fibre Channel switches placed in between. These isolated SAN islands were designed and laid out in a simple fashion. Management was manual, but relatively easy. A few Excel spreadsheets showed the switch and disk configurations for each of the servers. The infrastructure was nothing more than several strands of fiber laid throughout the data center under the floor in the network trays. Switches were racked and located centrally between the servers and storage. Backups were performed on a daily basis. Soft and hard zones were configured and our SAN implementations were a success.
This initial configuration worked well while things were relatively small and isolated. However, some of the benefits of a SAN weren't being fully utilized in this design. New servers were added to these SAN islands, but once we grew beyond the capacity of the switches or arrays in the initial design, the various components of the SAN started to become obstacles. One by one, each component needed to be addressed.
Design your SAN with a topology that scales regardless of how small you initially start out. Spending money up front will save you both soft and hard costs down the road.
The soft costs saved include the time it takes to manage the SAN and to redesign and implement a core-edge topology later. The hard costs saved come from longer investment protection for the initial hardware purchase.
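To make the scaling argument concrete, the following sketch estimates how many device-facing ports a core-edge fabric offers. It is an illustration only: the function name, the switch sizes and the link counts are hypothetical rather than figures from the deployment described here, and it assumes every server, array and tape drive attaches to an edge switch.

```python
def core_edge_device_ports(edge_switches, ports_per_edge, cores,
                           links_per_edge_per_core, ports_per_core):
    """Return the number of edge ports left for devices in one fabric.

    Each edge switch spends `links_per_edge_per_core` ports on uplinks
    to every core switch; whatever remains is available for devices.
    """
    uplinks_per_edge = cores * links_per_edge_per_core
    if uplinks_per_edge >= ports_per_edge:
        raise ValueError("edge switch has no ports left for devices")

    # Every core switch must terminate the uplinks from all edge switches.
    core_ports_needed = edge_switches * links_per_edge_per_core
    if core_ports_needed > ports_per_core:
        raise ValueError("core switch is too small for this many edges")

    return edge_switches * (ports_per_edge - uplinks_per_edge)


# Hypothetical fabric: 8 edge switches of 16 ports, 2 cores of 64 ports,
# 2 links from each edge switch to each core.
device_ports = core_edge_device_ports(8, 16, 2, 2, 64)
print(device_ports)
```

In this hypothetical layout each core switch uses only 16 of its 64 ports, so further edge switches can be attached later without rewiring the existing ones.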
You can protect your initial investment if you correctly anticipate faster hardware speeds for tape, switches, servers and storage. With speeds increasing and the ability to create trunks between your switches, you'll have more flexibility if you've designed an architecture that lets you move the older and slower technology out to the edge and implement the new faster hardware at the core. You'll also reduce the amount of downtime you experience in the future. With our SAN islands, we had to bring down the fabric - and perhaps the servers as well - in order to merge islands, bring firmware levels in sync, or upgrade them to the latest version to support more or newer drives.
We found that it was better to schedule downtime when making most major changes to a SAN. Because SAN technology was in its infancy at the time, interoperability problems and older versions of software, firmware and drivers could - and did - result in unplanned outages. Ensuring data integrity and uptime for our customers was the main objective, so scheduling downtime for maintenance was sometimes necessary.
SAN technology has matured and interoperability has improved, so you may not need to bring everything down to make changes now.
But without the right architecture, you may be forced into awkward configurations just to use all the available resources across multiple islands. In our case, we would sometimes end up with small switches daisy-chained to each other through a single inter-switch link (ISL) in order to achieve this. As a result, our SAN was vulnerable to single points of failure.
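One way to spot this kind of exposure is to model the fabric as a graph of switches and inter-switch links (ISLs) and check which links would split the fabric if they failed. The sketch below is a minimal illustration, not a tool we used at the time; the switch names and both example layouts are hypothetical.

```python
from collections import defaultdict, deque


def reachable(links, start):
    """Return the set of switches reachable from `start` over `links`."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        switch = queue.popleft()
        for neighbor in adjacency[switch]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_points_of_failure(links):
    """Return every ISL whose loss cuts some switch off from the fabric.

    `links` is a list of (switch, switch) pairs. Parallel ISLs between
    the same two switches are listed once per physical link, so losing
    one of them still leaves the pair connected.
    """
    critical = []
    for i, link in enumerate(links):
        start = link[0]
        before = reachable(links, start)
        after = reachable(links[:i] + links[i + 1:], start)
        if after != before:
            critical.append(link)
    return critical


# Hypothetical daisy chain: each switch hangs off the next by one ISL.
daisy_chain = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4")]

# Hypothetical core-edge layout: each edge switch links to both cores.
core_edge = [("edge1", "core1"), ("edge1", "core2"),
             ("edge2", "core1"), ("edge2", "core2")]

print(single_points_of_failure(daisy_chain))
print(single_points_of_failure(core_edge))
```

In the daisy chain every ISL is a single point of failure, while the dual-core layout has none. The check covers links only; a failed switch, HBA or array port would need a similar test.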
This was first published in January 2003