Published: 10 Nov 2003
Five years ago, the first Fibre Channel (FC) storage area networks (SANs) began to be deployed in production environments. At that time, the connectivity options were few, consisting primarily of eight- or 16-port departmental switches and 32-port director-class switches. Because these early SANs served primarily as SCSI-interconnect replacements in environments connecting a relatively small number of servers and devices, this limited expandability wasn't a major problem.
These initial SANs validated the touted benefits of networked storage. As the growth in the number of deployed ports and switches attests, most users have made SANs an integral part of their storage strategy. Benefits of SAN technology are derived from these three key advantages relative to the previous parallel SCSI technology:
- Enabling of increased distances for connectivity
- Ease of modification and expansion
- Support for a relatively large number of devices
Users typically begin to encounter greater challenges when they attempt to at least double the size of their SAN environment, or when they consider integrating newer products into their existing infrastructure. Two fundamental issues are managing expansion in a nondisruptive manner and ensuring that the resulting SAN will perform as expected. Among the variety of problems commonly encountered when growing a SAN are:
- Port availability for ISLs
- Planning for access requirements to avoid excessive hop count
- ISL saturation and imbalance
- Conflicting host bus adapter (HBA) capabilities and vendor support in heterogeneous environments
- Disruption to the production environment during transition to expanded design
- Management and security constraints
- Zoning complexities and potential for errors affecting the entire SAN
So what are the design goals to consider? A scalable SAN design has these four traits: the ability to add hosts and storage/tape devices without reconfiguring ISLs; the ability to increase port count without increasing hop count; the ability to add nodes with minimal placement criteria; and, lastly, the well-thought-out application of fault isolation technologies, such as autonomous regions and zoning, which prevent a single event from disrupting an entire SAN.
Currently, there are two common approaches to SAN expansion: one that can be thought of as "storage-centric" (see "Credit Valley Hospital runs out of room") and another that is "network-centric" (see "Building on the old SAN" and "Guardian Life swaps the new for the old"). Depending on the environment, choosing the right approach can have a significant impact on the scalability and cost of the SAN.
The storage-centric approach is seen most often in small (fewer than 50 ports) to midsize (50 to 100 port) SAN environments. Administrators typically purchase their new FC switches as part of a storage upgrade that also includes larger capacity storage arrays, additional connected hosts and the latest in tape drive technologies. These users expend minimal effort evaluating the different technologies available from the leading FC switch providers. Essentially, the storage systems are the primary purchase and the switch is treated as an ancillary component. In these cases, the switch recommendation of the selected storage vendor is followed with little further consideration. Because only one or two of the current director-class switches are needed to accommodate this number of nodes, the "Day 1" SAN design considerations are minimal.
In contrast, we have found that administrators of large SANs (more than 100 ports or more than four switches per fabric) often take a network-centric approach. They evaluate, select and design their next-generation SANs independently of additional server, storage and tape purchases. This is likely because their long-term requirements are less clear, so they need the flexibility to address a wider range of possibilities. Some of the considerations that they must weigh include:
- Cost vs. flexibility: Scalable multiple-switch designs carry high Day 1 costs. This cost is frequently magnified by low Day 1 utilization of available ports (see "Merrill Lynch's three-year refresh").
- Interoperability and heterogeneity: Interoperability between switches--even between models from the same manufacturer--is limited. For example, Brocade trunking isn't supported between the vendor's new model switches and its popular 2xxx series switches. Therefore, a significant consideration when planning a major upgrade is frequently whether the entire existing switch infrastructure needs to be replaced.
- ISL bottlenecks: As the port count increases, so does the aggregate fabric I/O. More I/O increases the probability of ISL congestion.
- Excessive hop counts: In poorly planned SAN configurations, expanding port counts often leads to an excessive number of hops between host and storage, significantly increasing latency. As a rule, the fewer hops between the initiator HBA and target storage devices, the better the overall throughput.
- Complex connectivity requirements: Larger SANs are more likely to have multiple targets for a single initiator. These targets include multiple storage arrays and tape devices. Greater care must be taken to ensure that configuration best practices such as vendor fan-out recommendations continue to be followed.
- Extending beyond the data center: Geographical considerations introduce a host of issues, including distance limitations and extension costs, which can have a far-reaching impact on the usefulness of the SAN. Connection issues such as increased latency, the risk and impact of isolating part of a SAN and link state changes over a single, wide-area fabric can be limiting factors in SAN design.
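Hop count, one of the considerations listed above, can be checked on paper before any cabling changes. The following is a minimal Python sketch rather than a vendor tool: the switch names and ISL maps are hypothetical, and a hop is assumed to mean one ISL traversal along the shortest path between the host's switch and the storage device's switch.

```python
from collections import deque

def hop_count(isls, src_switch, dst_switch):
    """Return the fewest ISL traversals between two switches, or None if unreachable."""
    # Build an undirected adjacency map from the list of ISLs.
    neighbors = {}
    for a, b in isls:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Breadth-first search reaches switches in order of increasing hop count.
    seen = {src_switch}
    queue = deque([(src_switch, 0)])
    while queue:
        switch, hops = queue.popleft()
        if switch == dst_switch:
            return hops
        for nxt in neighbors.get(switch, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

# Hypothetical fabrics: a core-edge design and a daisy chain of small switches.
core_edge = [("host_edge", "core"), ("core", "storage_edge")]
daisy_chain = [("sw1", "sw2"), ("sw2", "sw3"), ("sw3", "sw4"), ("sw4", "sw5")]

core_edge_hops = hop_count(core_edge, "host_edge", "storage_edge")  # 2
daisy_chain_hops = hop_count(daisy_chain, "sw1", "sw5")             # 4
```

In this sketch the core-edge layout holds every host-to-storage path at two hops no matter how many edge switches are added, while each switch appended to the chain adds a hop for the farthest hosts.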
Technology for large SANs
If you're building a large SAN infrastructure, what technologies should be considered? Vendors have been listening to their largest customers and are adding features to switch offerings that can help to overcome a number of expansion hurdles. Here are some of the most important.
Cisco Systems Inc. was the first to introduce autonomous regions (ARs) into FC switches with its Virtual SAN (VSAN) feature. ARs create independent fabric services for each set of ports that are configured within an AR. Each AR is a logically independent SAN within a physical storage network. Changes made within one AR don't impact other ARs. For example, re-zoning is one of the riskiest configuration changes that can be made in a production fabric environment (due to its ability to impact every node on the fabric). With ARs, a zone change can now be limited only to ports within a specific AR.
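The fault-isolation idea can be illustrated with a small model. This is a conceptual Python sketch only: the class, method and port names are hypothetical and do not correspond to any switch vendor's interface. It assumes each AR keeps its own zone set and that a zone change is visible only to the ports assigned to that AR.

```python
from dataclasses import dataclass, field

@dataclass
class AutonomousRegion:
    """A logically independent SAN: its own member ports and its own zone set."""
    name: str
    ports: set = field(default_factory=set)
    zones: dict = field(default_factory=dict)  # zone name -> set of member ports

    def set_zone(self, zone_name, members):
        """Apply a zone change and return the ports that can be affected by it."""
        outside = set(members) - self.ports
        if outside:
            # A zone cannot reach into another region's ports.
            raise ValueError(f"ports not in {self.name}: {sorted(outside)}")
        self.zones[zone_name] = set(members)
        # The blast radius is this region's ports, not the whole fabric.
        return set(self.ports)

# Two hypothetical regions sharing one physical fabric.
finance = AutonomousRegion("finance", ports={"f1", "f2", "f3"})
engineering = AutonomousRegion("engineering", ports={"e1", "e2"})

affected = finance.set_zone("payroll_backup", {"f1", "f3"})
```

Here `affected` contains only the three finance ports, and the engineering region's zone set is untouched by the change.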
Today's SAN administrator typically has full access to the entire fabric. Switches are beginning to offer multiple login groups with more granular access. When combined with autonomous regions, this allows a departmental administrator to be given access to make changes only in his own AR. This added level of security makes it easier to restrict the ability of one administrator to affect the entire SAN.
Early FC switches had inefficient algorithms for routing I/O across multiple ISLs. Previously, routing was done on a per-connection basis, with all I/O between a single initiator and target combination traveling across a single ISL between two switches. If two connections had substantially different I/O loads, traffic would be unequally distributed across the available ISLs. The latest technologies permit striping at the frame level across multiple ISLs. This increases the overall utilization of available ISLs, reducing the number of ISLs required, which in turn leaves more usable node ports.
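The difference between the two routing schemes can be seen with a toy calculation. The Python sketch below is illustrative only: the frame counts are made up, and pinning flows to ISLs by position is an assumed stand-in for whatever assignment rule a given switch actually uses.

```python
def per_connection_load(flows, isl_count):
    """Pin each initiator/target flow to one ISL; return frames carried per ISL."""
    load = [0] * isl_count
    for index, frames in enumerate(flows):
        load[index % isl_count] += frames  # the whole flow rides a single ISL
    return load

def frame_striped_load(flows, isl_count):
    """Spread all frames round-robin across the ISLs; return frames carried per ISL."""
    base, extra = divmod(sum(flows), isl_count)
    return [base + (1 if i < extra else 0) for i in range(isl_count)]

# Hypothetical workload: one heavy flow and three light ones over two ISLs.
flows = [9000, 500, 300, 200]

pinned = per_connection_load(flows, 2)   # [9300, 700]
striped = frame_striped_load(flows, 2)   # [5000, 5000]
```

With per-connection routing the heavy flow saturates one link while the other sits nearly idle; striping at the frame level evens out the load across both.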
Port counts of 100-plus per switch are now common on director-class switches. Vendors such as McData Corp. are already bringing to market 256-port devices and have roadmaps for 512-port devices. These higher port-count devices go a long way toward simplifying large SAN designs within the data center. The greater the number of ports per switch, the lower the number of ISLs required, resulting in a greater number of usable ports. In addition, there are fewer design considerations when a smaller number of high port-count switches are used.
The ability to have an IP blade option in a switch makes it easier to expand SANs across greater distances. Previously, specialized channel extension equipment was required to create a single fabric between two switches more than 10 km apart. Today, using a company's existing IP WAN infrastructure, fabrics can be more easily expanded across multiple, geographically separate locations. This is particularly useful for data replication.
The cost of scalability
Unfortunately, designing a SAN for scalability with minimal disruption in a production environment comes at a price. The high Day 1 cost can't be justified in many environments, despite advantages in manageability.
For example, consider the problem of designing a SAN to support a future goal of 2,500 connections. The classic three-tiered SAN design (see "A three-tiered switch SAN design model") would be one way to approach the problem. Consisting of a host tier, a connectivity tier and a storage tier, this configuration enables fabric scalability to a large number of nodes with minimal reconfiguration. A three-tiered SAN could be designed around two 140-port core switches supporting twenty 140-port edge switches (with 14 ISLs per edge switch).
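The port arithmetic behind this example can be verified with a few lines of Python. This is a sketch under two stated assumptions: every core port is dedicated to ISLs, and every edge port not used for an ISL is available to a host or storage device. The function name is hypothetical.

```python
def core_edge_capacity(core_switches, edge_switches, ports_per_switch, isls_per_edge):
    """Return (node_ports, core_ports_used, core_ports_available) for the fabric."""
    # Edge ports left for hosts and storage after ISLs are subtracted.
    node_ports = edge_switches * (ports_per_switch - isls_per_edge)
    # Each edge ISL lands on one core port.
    core_ports_used = edge_switches * isls_per_edge
    core_ports_available = core_switches * ports_per_switch
    return node_ports, core_ports_used, core_ports_available

node_ports, core_used, core_available = core_edge_capacity(2, 20, 140, 14)
# node_ports == 2520, core_used == 280, core_available == 280
```

The 20 edge switches yield 2,520 node ports, just over the 2,500-connection goal, and their 280 ISLs exactly fill the 280 ports on the two cores.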
However, if only a few hundred ports are required initially, the economics don't make sense. The Day 1 purchase requires two core switches plus a minimum of two edge switches. The resulting cost per port would be too high for most organizations. In fact, it's even worse when you double the number of switches to support an independent redundant SAN, a recommended best practice and often a requirement in production environments.
The result is that many companies compromise with a less scalable two-tiered architecture (see "A two-tiered switch SAN design model"). This design supports nearly half as many nodes as the three-tiered architecture, but requires only two switches per fabric to meet Day 1 requirements. While the two-tiered design can later be converted to a three-tiered one, significant reconfiguration would be required, causing a major disruption. Redundant fabrics can mitigate some of this disruption, but the process of planning and executing this expansion would still be a considerable effort.
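Because no prices are given here, one rough proxy for the Day 1 penalty is the number of switch ports purchased for each port a host or storage device can actually use. The Python sketch below reuses the 140-port switches and 14 ISLs per edge switch from the three-tiered example; it assumes switches are bought fully populated, which may overstate the cost where directors can be populated a blade at a time. The function name is hypothetical.

```python
def ports_bought_per_node_port(core_switches, edge_switches, ports_per_switch, isls_per_edge):
    """Ratio of switch ports purchased to ports usable by hosts and storage."""
    purchased = (core_switches + edge_switches) * ports_per_switch
    usable = edge_switches * (ports_per_switch - isls_per_edge)
    return purchased / usable

day1 = ports_bought_per_node_port(2, 2, 140, 14)         # 560 / 252, about 2.22
full_build = ports_bought_per_node_port(2, 20, 140, 14)  # 3080 / 2520, about 1.22
```

On these assumptions, the Day 1 configuration buys roughly 2.2 ports for every usable node port, against about 1.2 at full build. A redundant second fabric leaves the ratio unchanged but doubles the absolute spend.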
As technologies mature and solutions evolve, many of today's SAN shortcomings will subside, and the trend toward larger SANs will continue (see "How big is too big?"). Just as the early adopters of SAN technology realized benefits over the previous technologies, further benefits will be realized by building larger networks.
So what should be done today to enable successful expansion in the future? If you aren't already doing so, start taking a network-centric approach to SAN design and purchasing decisions. Begin by thinking of SAN infrastructure independently of storage, and develop a set of requirements for SAN infrastructure that supports potential expansion plans. Logically decouple SAN purchases from storage purchases, and make sure the new equipment has the features you need. As much as is practical from a cost perspective (see "Are smaller switches an option?"), design for minimal disruption. Act now to position yourself to take advantage of the trend toward larger SANs.
Merrill Lynch's three-year refresh
Larger shops plan for a complete storage area network (SAN) technology refresh every three or four years, typically the length of the lease arrangement with their existing vendors.
"All designs are short-lived, right?" says Max Riggsbee, an independent storage consultant working with Merrill Lynch & Co. Inc. in New York. "At the end of three years, there's new technology to consider and we're likely to change out different elements of the SAN."
Merrill's SAN currently supports approximately 300 hosts, with a design maximum of 500 hosts and around 1,000 host bus adapters (HBAs). "If we get up to the limit, we can add more edge switches," Riggsbee says. "But at some point, we'll hit the port density of the switches."
The SAN is based on a three-tier design that uses 122 Brocade 12000 switches, says Jerry Curtis, business manager of data management and distributed services at Merrill Lynch. "We have a single dual-path SAN fabric that allows all systems to see all the storage devices," he explains. The company's storage includes approximately 240TB of EMC storage and another 100TB in Hitachi boxes.
Although the firm "probably uses more ports for inter-switch links (ISLs) than a traditional two-tier SAN island design," Curtis says it's worth it. The costs are offset by their high utilization of storage assets and the flexibility of managing and changing the existing storage configuration. "We have around seven people managing 340 terabytes of the storage," he says. "We made decisions about SAN islands vs. three-tier, and the costs are pretty minor in comparison." Also, he says, with a traditional two-tier design, "you can have problems getting to the storage devices."
On the utilization side, "most of our arrays run around 95%," Curtis says, in large part because provisioning software is designed and written in-house. Storage administrators schedule the provisioning after stock exchange hours, and the software makes whatever zoning changes are needed to the fabric. Merrill Lynch is currently negotiating with commercial software vendors to take over the development and support of that provisioning system because "I'd rather have someone else doing the testing and point releases," Curtis says. "It's not part of our core competency."
The firm recently installed an application subscription system that changes the way that Merrill Lynch bills out storage (and all IT resources) to end-user departments. Now Curtis can tell users exactly what he's going to charge them every year for various applications. Every application now subscribes to resources in the data center, and "we have sponsors behind the applications. And we've become Amazon.com--we order it, track it and can change various line items. The result has been very tight management of our resources," he says.
When the company hits the end of the expansion road for its current SAN design, Riggsbee says, he'll be looking at technologies that allow for multipath access from his applications to different levels of storage. This will provide more redundancy, and will allow access to different types of nearline and online storage for backup. Ideally, he says, any new intelligent fabric and switches will provide for synchronous and asynchronous backup at the same time.
"We're seeing some talk in the SAN space about products moving into the fabric, and that's a great thing," Curtis says. One example, he says, would be the types of functions provided by Veritas' Volume Manager--configuring, sharing, managing and optimizing storage I/O performance without interrupting data availability. Curtis' major goals with this more fabric-based approach would be to make storage functions "easier to implement" and to more effectively use network bandwidth, he says.
Curtis wishes he could plug a server into a fabric with the same ease as plugging a PC into a network. "When you do that," he says, "you never have to worry about compatibility issues between the PC and the network ... until it gets much easier, companies will be cautious about committing to a SAN-attached environment because of the labor costs and complexity."
In the meantime, Merrill was beginning its annual storage review as this story went to press. "We go through the assumptions and policies we have for the current year, and decide if they still hold true for the coming year," Curtis says. The team applies what it knows about end users' upcoming projects and other needs, as well as technology roadmaps from vendors, to determine how the SAN will expand.
--By Johanna Ambrosio
Are smaller switches an option?
The 16- and 32-port switches that formed the basis of most storage area network (SAN) infrastructures became popular because of their relatively low incremental cost. Why not continue to expand using this model and avoid the high initial capital expense of director-class devices? Couldn't this be viable as a "pay as you grow" option?
Using 32-port switches as building blocks, let's examine the possible growth of a small SAN (see "A two-tiered switch SAN design model"). A basic design assumption here is that a 10:1 ratio will be maintained between hosts and storage through inter-switch links (ISLs). The specific ratio will vary depending on the actual performance capabilities and requirements of the environment. It would be possible to add up to three more host switches (for a total of six) to this environment to support up to a total of 140 hosts (see "Basic two-tier fabric expanded"), and still maintain this ratio.
At this point, the SAN design would need to be modified. Continuing to use 32-port switches, the redesigned SAN might look like the figure called "Reconfigured two-tier fabric." This new configuration supports up to 216 host ports. The additional four switches (over the previous configuration) yield only 76 additional host connections.
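The diminishing return is easy to quantify from the figures just given. The short Python calculation below uses only the host counts and switch size stated above; the variable names are hypothetical.

```python
hosts_before = 140      # host ports in the expanded two-tier fabric
hosts_after = 216       # host ports in the reconfigured fabric
added_switches = 4
ports_per_switch = 32

added_hosts = hosts_after - hosts_before                 # 76
hosts_per_added_switch = added_hosts / added_switches    # 19.0
# Share of the newly purchased ports that ends up serving hosts.
yield_fraction = added_hosts / (added_switches * ports_per_switch)  # 0.59375
```

Only about 59% of the 128 ports added serve hosts, an average of 19 host connections per new 32-port switch.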
Obviously, there are many other design options for expanding this SAN. However, it's clear that there's a crossover point where smaller switches are no longer economically feasible.
Beyond cost, the added complexities of maintaining and managing a mesh fabric of small switches are significant. It can be difficult to ensure consistent access to devices throughout the SAN. For instance, the need to find free ports to connect tape drives and other storage devices often results in non-optimal behavior, such as plugging into free host-side ports. The result is a jumble of servers and devices connected randomly throughout the fabric. Very quickly, any cost savings of switches over directors can dissipate due to these manageability issues. Smaller switches scale up to a point, and they are useful as edge devices, but it's critical to use some foresight to avoid these problems.
Credit Valley Hospital runs out of room
Sometimes, despite the best original plans, shops just run out of room. That's what happened at the Credit Valley Hospital, a facility in Mississauga, Ontario that sees more than 20,000 patients each year.
Credit Valley bought its initial storage area network (SAN) in the summer of 2002, expanded it for the first time with a few more switches in February 2003 and expanded a second time in May 2003 with more disk.
"We bought what we needed at the time, but we just needed more than we thought," says Leigh Popov, manager of technical services and telecommunications at the hospital. The first expansion was to double the number of ports, from 32 to 64 on its Brocade SilkWorm 3900 switches. The second growth spurt grew disk space from 7.7TB to around 10TB.
"We found that we used up the disk pretty fast just by transferring radiology to the SAN," Popov says.
At the same time the hospital is building its core SAN, it has also been constructing a 500-slot tape library--an IBM 3584 LTO system--as its nearline backup facility. The hospital is using LTO to replace its aging optical jukeboxes and StorageTek DLT devices.
Next up for the SAN is adding all of the hospital's 65 servers to the server-less backup environment; about 12 are hooked in right now and the rest are being backed up to the DLT and jukeboxes across the LAN. Server-less backup, done via Legato NetWorker, allows the servers that manage large amounts of data to back themselves up across the SAN "with no impact on the LAN," Popov explains. Each server creates its own schedule, runs its own backup, then puts the backups onto tape and adds the index of those backups to the main index on the main backup server.
Also, cardiology is starting to come online, which will add quite a lot of storage to the managed pool: A typical echocardiogram can require up to 1GB.
All told, Popov says, "I'm looking forward maybe three years. It's difficult to go any farther out than that, and I kind of laugh when I hear vendors say they're going to sell me something that will last for 10 years."
When the hospital bought the SAN originally, he says, "We slated it for a life span of maybe four to five years, and we made sure that it was modular enough to switch things around for that time frame." Disk can grow to around 50TB, from its current 10TB, and if the hospital needs more switches to accommodate its growth, it can just add one or more to the fabric.
--By Johanna Ambrosio
Building on the old SAN
The Wake Forest University Baptist Medical Center in North Carolina began building its storage area network (SAN) around two years ago, says Bob Massengill, manager of technical services at the medical center, which operates 20 subsidiary hospitals as well as 90 clinics around the region. "We had to prove the concept" of a SAN to the center's internal users, who had been accustomed to direct-attached storage (DAS) from IBM Corp. Going with a SAN, and then with EMC Corp. as its primary storage vendor, took some getting used to. The medical center made the switch mainly because IBM's snapshot software for the mainframe required a full-volume update each time, whereas EMC's didn't. The center currently uses an older Symmetrix model 8830, with around 22TB of usable storage, and a newer DMX 2000 with 8TB of usable storage.
SAN usage started slowly because the SAN was a new idea and people were accustomed to the DAS world. Over time, though, many of the people in the end-user groups who had been reluctant found out that the SAN was only a centralized storage area, which the end users can continue to manage as they see fit. "I allocate," Massengill says, "then they do what they want or need."
The center began with two 16-port Sphereon 3000 McData switches. But after a year, users began clamoring to get their data over to the SAN environment, and an upgrade was needed, he says. That's when the center ordered its two 64-port Intrepid 6000 Series director-class switches, the second of which just went online. (The two older switches will continue to be used for production purposes until all the servers are moved over to the director switches and, eventually, will be used for backup only.) Now the first SAN-based application--e-mail--has been joined by financial applications and radiology.
The two director-class switches are separate from each other, by design, for backup reasons, Massengill says. Each of the 35 servers is attached to both switches, so in case one switch goes down the servers can still see the storage. In addition to needing two ports for each server, the StorageTek tape silo eats up many ports as well, he says.
Still coming is a full disaster recovery application for all the center's 400 servers. "This gets back to the benefit of not losing those two 16-port switches," Massengill says.
The original SAN connected around 7TB of storage with around 10 servers going through the two McData 16-port switches. He expects that the second 64-port switch will be fully populated by next summer, and then he'll probably get another for additional servers coming online to the SAN. In the meantime, he can always purchase a four-port card to turn one port into four, for a "couple of hundred bucks." That's cheaper than having a fully loaded director-level switch sitting around waiting to be used, he says.
--By Johanna Ambrosio
Guardian Life swaps the new for the old
The Guardian Life Insurance Company of America, based in New York, NY, took a middle ground between storage- and network-centric storage area network (SAN) expansion. Although the firm evaluated both the fabric and storage decisions separately, it decided to go with the fabric and switches suggested by EMC Corp.
The firm just bought new DMX 2000 arrays, as well as a 192-port McData fabric, for its Bethlehem, PA-based data center, says Bob Mathers, second vice president of IT operations at Guardian Life. When the fabric is fully populated in this go-round, it will probably be at around 120-port utilization.
At the same time, Guardian is implementing a backup SAN--exactly matching the setup in Bethlehem--for a new backup data center in Pittsfield, MA. The primary center will replicate data asynchronously to the backup center every couple of hours; about 60% of the primary production environment will be backed up to the second data center, which becomes primary if the first fails.
The new gear is replacing some Brocade switches, as well as older EMC storage boxes and some IBM Shark devices. "I was open to continuing to have a multivendor environment," Mathers says. "But as we went through the process, we felt that EMC was the vendor of choice. They were more proactive and more responsive through the RFP process--and I've told IBM that."
A big part of the reason behind the expansion--besides just running out of space on the older SAN gear--was the idea of computing-on-demand. "We want to go with this in every area we can," Mathers explains. "We want extra capacity, but we don't want to pay for it until we need it." In this case, they will be using around 70TB right off the bat, but will have around 90TB worth of capacity for when they need it later.
On the fabric side, both EMC and the other major vendor in the running--IBM--had recommended the use of McData Corp. "If we had wanted Brocade, they would have been fine with that," Mathers says. "But we felt that McData was the better product. McData directors have been around a while, where Brocade directors were new to the market," he explains.
With EMC, Mathers says, "We have a very strong storage partner who's in line with our capacity-on-demand model."
--By Johanna Ambrosio