The best way to scale SANs

SAN configurations greatly influence their scalability. There's no perfect model, but here are the trade-offs involved with each major option.

The consolidation of servers and business units is a tsunami crashing on most IT departments today. Consolidation and conservation of storage resources is rapidly becoming a focus, but how best to scale up today's storage area networks (SANs) to larger and more utility-like storage networks is unclear.


Option 1: A meshed SAN

A meshed SAN is easy to create, but doesn't scale past a few switches because inter-switch links consume ports.

If you interconnect small switches into a meshed fabric (see "Option 1: A meshed SAN" on this page), the number of required inter-switch links (ISLs) grows quickly with the number of included switches, limiting the number of ports available on the fabric. In fact, with five 16-port switches, each linked to every other switch by a single ISL, 25% of the available ports in the fabric are used up. Meshing 17 switches would require every port in the fabric.
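The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration, assuming a full mesh in which every switch has one ISL to every other switch; the function names are hypothetical and not taken from any vendor tool.

```python
def mesh_isl_fraction(num_switches: int, ports_per_switch: int,
                      links_per_pair: int = 1) -> float:
    """Fraction of fabric ports consumed by ISLs in a full mesh."""
    # Each switch needs a link to every other switch in the mesh.
    isl_ports = (num_switches - 1) * links_per_pair
    if isl_ports > ports_per_switch:
        raise ValueError("not enough ports per switch to build this mesh")
    return isl_ports / ports_per_switch


def mesh_usable_ports(num_switches: int, ports_per_switch: int,
                      links_per_pair: int = 1) -> int:
    """Ports left over for hosts and storage across the whole fabric."""
    isl_ports = (num_switches - 1) * links_per_pair
    return num_switches * max(ports_per_switch - isl_ports, 0)


if __name__ == "__main__":
    for n in (5, 17):
        frac = mesh_isl_fraction(n, 16)
        free = mesh_usable_ports(n, 16)
        print(f"{n} switches: {frac:.0%} of ports are ISLs, {free} ports usable")
```

With 16-port switches, the five-switch mesh leaves 60 ports for devices, while the 17-switch mesh leaves none.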

Instead, most SAN equipment manufacturers and consultants are recommending a so-called core/edge design, placing a few high-capacity switches--often called directors--in the center of a star topology. This conserves ISL ports, requiring just one per edge switch, and is quite scalable. Many vendors encourage multiple ISLs per edge switch, but even this scales well. Rearranging a five-switch mesh into a star with dual ISLs drops the number of ports used for ISLs to just 13% of the fabric, and it remains there as more switches are added.
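As a rough check on the star figure, the sketch below assumes the percentage counts only the ports on 16-port edge switches, not the ports on the director itself. That reading is an interpretation rather than something the text spells out, and the function names are hypothetical.

```python
def star_edge_isl_fraction(num_edge_switches: int, ports_per_edge: int = 16,
                           isls_per_edge: int = 2) -> float:
    """Fraction of edge-switch ports used for ISLs in a star topology."""
    isl_ports = num_edge_switches * isls_per_edge
    total_ports = num_edge_switches * ports_per_edge
    return isl_ports / total_ports


def director_ports_needed(num_edge_switches: int, isls_per_edge: int = 2) -> int:
    """Director ports consumed by the edge switches' ISLs."""
    return num_edge_switches * isls_per_edge


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(f"{n} edge switches: {star_edge_isl_fraction(n):.1%} ISL ports, "
              f"{director_ports_needed(n)} director ports")
```

Under these assumptions the edge-side fraction is 2 of 16 ports, or 12.5%, regardless of how many edge switches are attached. What does grow is the number of director ports consumed.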

The true core/edge design (see "Option 3: A core/edge SAN") places all devices, whether hosts or storage, at the edge, and may even segregate storage devices to one side of the star and hosts to the other. This pure design is easy to scale and enforce because some ports are reserved for hosts and others for storage devices. Since each host switch is equidistant from each storage switch, the exact switch used isn't important.

The weakness of this approach is that it requires two hops through the fabric per connection, introducing latency, which is far more critical to storage than in more traditional networking arenas. And even today's blade-based directors don't scale infinitely. Eventually, another director will have to be added in the center. Once this happens, the inherent equality of all edge switches will falter because some will be connected to one director and some to another. The placement of devices will begin to matter to achieve maximum performance. Congestion at the director-to-director ISLs and latency differences between two hops and three might make performance unpredictable, even with multipathing software such as EMC's PowerPath or Veritas' DMP.

You can reduce the number of hops while still maintaining scalability by connecting all storage devices directly to the core directors. That reduces the hop count to just one and puts the focus of the SAN on fan out from storage devices, rather than pretending that all devices--storage and host--are accessed equally. But this also adds more connections to the core directors, hastening the time when all director ports are used and another director has to be added. Although such a core resource design might perform better than a pure core/edge design, it won't scale as well.
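The hop counts discussed here can be reproduced with a toy model. The sketch below treats each fabric as a graph of switches and counts the ISLs crossed between the switch a host plugs into and the switch its storage plugs into. The switch names and topologies are made up for illustration.

```python
from collections import deque


def hops(fabric: dict, src: str, dst: str) -> int:
    """Number of ISLs crossed between two switches (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        switch, dist = queue.popleft()
        if switch == dst:
            return dist
        for neighbor in fabric[switch]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    raise ValueError(f"no path from {src} to {dst}")


# Pure core/edge: hosts and storage sit on separate edge switches.
single_director = {
    "director": ["host_edge", "storage_edge"],
    "host_edge": ["director"],
    "storage_edge": ["director"],
}

# Two directors joined by an ISL, each serving its own edge switch.
two_directors = {
    "director_a": ["director_b", "host_edge"],
    "director_b": ["director_a", "storage_edge"],
    "host_edge": ["director_a"],
    "storage_edge": ["director_b"],
}

if __name__ == "__main__":
    print("pure core/edge:", hops(single_director, "host_edge", "storage_edge"))
    print("second director:", hops(two_directors, "host_edge", "storage_edge"))
    print("storage on core:", hops(single_director, "host_edge", "director"))
```

The three cases come out at two, three and one hop respectively, matching the figures in the text.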


Option 2: A collocated SAN

A collocated SAN with devices near the hosts they serve works well, but can be hard to keep tidy. On the plus side, it's a fairly quick way to connect multiple islands.

A really interesting alternative approach is the collocated design (see "Option 2: A collocated SAN" on this page). This intermingles storage devices and host connections on the edge switches, keeping paired devices as close together as possible. This design scales like a pure core/edge design because no devices are placed on the core, but can perform much better since resources are kept close to their users.

Yet you can have difficulty maintaining this configuration over the long term. Edge switches are usually not expandable, so ports on every edge switch will have to be reserved for future use. And a last-minute connection of a device far from its user will tend to stay put for the long term, reintroducing the latency and ISL congestion concerns the collocated design was supposed to eliminate. The real benefit of the collocated design is that, by definition, SAN islands are already collocated. Simply adding a director to interconnect the far-flung fabrics in a data center instantly transforms it into a collocated SAN.

No matter what architecture is chosen, one critical design element should always be in place: dual redundant fabrics. The interconnected enterprise SAN is the seawall around your entire organization's data. If a SAN island sinks, it takes down only a few applications. But an enterprise SAN failure is a disaster of Atlantean proportions.

The best way to protect yourself from this nightmare scenario is to create two mirror-image fabrics and connect every host and storage system to both, using pathing software to maintain connectivity if one SAN should fail. And even though today's virtual SAN technologies can provide some extra protection from fabric failure, do not rely on a single piece of hardware to provide both paths--use two "air-gap" separated fabrics.
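One way to keep that rule honest is to audit an inventory of connections periodically. The following is a minimal sketch, assuming you can export a mapping of each device to the fabrics it is cabled to. The fabric labels "A" and "B", the device names and the function name are all hypothetical.

```python
def single_fabric_devices(attachments: dict, fabrics=("A", "B")) -> list:
    """Return devices that are not attached to every required fabric."""
    required = set(fabrics)
    return sorted(
        device
        for device, attached in attachments.items()
        if not required <= set(attached)
    )


if __name__ == "__main__":
    # Hypothetical inventory: device name -> fabrics it is connected to.
    inventory = {
        "db01": {"A", "B"},
        "mail01": {"A"},
        "array01": {"A", "B"},
        "tape01": {"B"},
    }
    for device in single_fabric_devices(inventory):
        print(f"{device} would lose access if one fabric failed")
```

Any device the check reports has a single point of failure at the fabric level, whatever pathing software is installed on it.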

The best-practice core/edge and collocated SAN designs can be put into place slowly, as part of an evolution from a chain of isolated SAN islands to a unified SAN. The hardest part is maintaining a logical architecture in the face of constant change.


Option 3: A core/edge SAN

Arranging SAN equipment into server sides and device sides connected by a director guarantees scalability, but introduces inter-switch hops. Eventually, another director will be needed, adding more hops.

The first step is to decide where to place the existing switches in the new architecture. Most sites facing this task do not already have director-class hardware, so the current switches can become edge switches in the new environment. But take a hard look at your hardware:

  • Consider moving up to newer switches if yours are more than a year or so old.
  • Unless you require arbitrated loop connectivity, now is the time to retire your Fibre Channel hubs.
  • Avoid heterogeneous SANs. Pick a single vendor for the new SAN, and swap out other equipment.

Review the compatibility of firmware revisions running on the switches to be consolidated. There may be special compatibility mode settings to make different switch types work together. Make sure identifiers such as domain IDs and zone names are unique on all the switches. Otherwise, the SAN will segment.
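A pre-merge check for clashing identifiers might look like the sketch below. It assumes you have already collected each island's domain IDs and zone names by whatever means your switches provide. The data layout, island names and function name are hypothetical, not any vendor's API.

```python
from collections import defaultdict


def find_conflicts(islands: dict) -> tuple:
    """Return domain IDs and zone names that appear in more than one island."""
    domain_owners = defaultdict(set)
    zone_owners = defaultdict(set)
    for island_name, island in islands.items():
        for domain_id in island["domain_ids"]:
            domain_owners[domain_id].add(island_name)
        for zone in island["zones"]:
            zone_owners[zone].add(island_name)
    # Keep only identifiers claimed by two or more islands.
    dup_domains = {d: sorted(o) for d, o in domain_owners.items() if len(o) > 1}
    dup_zones = {z: sorted(o) for z, o in zone_owners.items() if len(o) > 1}
    return dup_domains, dup_zones


if __name__ == "__main__":
    # Hypothetical inventory of two SAN islands about to be interconnected.
    islands = {
        "finance": {"domain_ids": [1, 2], "zones": ["oracle_prod", "backup"]},
        "engineering": {"domain_ids": [1, 3], "zones": ["build_farm", "backup"]},
    }
    domains, zones = find_conflicts(islands)
    print("domain IDs to renumber:", domains)
    print("zone names to rename:", zones)
```

Anything this reports should be renumbered or renamed before the first ISL between the islands is connected.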

Implementing a segregated core/edge design requires reconnecting storage and servers to their respective sides of the director. Collocated SANs require little reconnection, but be careful to reserve a few ports per switch for the future.

So the good news is that your current SAN islands can probably be interconnected without tearing everything out and starting over. But double-check the compatibility of your hardware and SAN services. Retire your outdated SAN devices. And get ready for the pressure of guarding your enterprise's crown jewels.
