Published: 08 Aug 2002
For all of the hype about virtualization, policy-based management and automated control of the storage network, few solutions available today can actually be deployed, and few of those offer truly seamless allocation and management of storage. In practice, most users are limited to the basic access control built into their switch and storage hardware. To paraphrase a once-popular bumper sticker, storage managers do it the old-fashioned way. They manage SANs by LUN masking on storage arrays, LUN masking/access control on a host and switch-based zoning in the fabric.
Novice users who first start exploring SANs often believe they can simply buy off-the-shelf components, plug them in and instantly have a network on which they can easily share and reallocate storage. In this ideal world, users could automatically share files and directories between servers with no conflicts, and would easily be able to share files across platforms. Users would also be able to create new storage and share it among servers connected to the SAN - all through their operating system's disk management tools.
Unfortunately, that's not the case. In real life, trying to plug and play a storage network is inviting data disaster. The reality of storage networks is that careful administration must occur in order for everything to work smoothly. To add storage you can't just plug in, for example, an array and expect it to be available to the correct hosts or interoperate properly. The addition of a storage array is usually a carefully scripted process, where the array is rigorously tested against a duplicate environment and then carefully scheduled for addition to a network to avoid disrupting running applications.
Windows reigns supreme
Because storage networking technology was developed only recently, few operating systems are aware of the storage network; most assume that all the storage they see belongs to them alone. SANs evolved from parallel SCSI roots, so most operating systems continue to use the outdated model of dedicated storage, treating the storage network as just a long SCSI bus. This leads to problems as those systems bump up against assumptions which no longer apply to shared storage.
One of the issues administrators cite most often is the behavior of the Windows operating system. Windows systems write an identification label to every disk discovered by Disk Administrator, which on a storage network means every disk the host can see. On a SAN with shared storage, this label will be written to disks that may be used by another operating system - for example, Solaris - often resulting in the corruption of that shared volume.
Don Whitlow, a storage administrator at Sussex, WI-based Quad/Graphics, manages two SAN islands of about 2TB each. He says, "Windows NT will clobber everything it sees on the SAN because it wants to own everything it can see." As a result, he says, "We have to carefully manage our network to prevent this from happening."
It's in the box
Luckily for users, the hardware available today offers a wealth of features which ease the management of the SAN. Storage arrays, HBAs and switches all include built-in controls that help users manage their storage.
Whitlow says, "We use LUN-based masking at the HBA, along with switch zoning to control our SAN," which allows him to fully manage equipment with a minimal amount of trouble. "That's worked out really well for us." His site has three Compaq subsystems, several Compaq-branded 16-port switches and a variety of HBAs in IBM-AIX and NT boxes on the same SAN. "Either one of the techniques would work, however, we're using both together very successfully," he says.
Switch-based zoning is the ability of a Fibre Channel (FC) switch to restrict which hosts can access which storage devices. Through software or hardware, the switch prevents devices from seeing each other if they aren't part of the same zone. For example, you could create a zone that only allows Solaris hosts, or a zone which is limited to your production database cluster. Devices outside the zone won't be able to see the storage or the hosts inside it.
Zones can be set up in one of three ways: The first - and simplest - is to restrict traffic between the physical ports of a switch. This is referred to as port zoning, and is used when users want to restrict physical connections to a switch, regardless of what host or piece of storage is attached in that location. The second, most commonly used technique is setting up a zone based on the World Wide Port Name (WWPN) of the devices. The WWPN uniquely identifies every FC port in a SAN, so a port zoned by its WWPN is always part of that zone - even if the topology and configuration of the network change, or if the device is accidentally moved or plugged into a different switch port.
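The difference between port zoning and WWPN zoning shows up when a cable moves. The following is a minimal sketch of that behavior, not any switch vendor's implementation: the `Attachment` class, the function names and the WWPN values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    """One device port cabled to one switch port (illustrative model only)."""
    wwpn: str         # World Wide Port Name of the device port (made-up values below)
    switch_port: int  # physical switch port the device is plugged into


def same_port_zone(zone_ports, a, b):
    """Port zoning: membership follows the physical switch port."""
    return a.switch_port in zone_ports and b.switch_port in zone_ports


def same_wwpn_zone(zone_wwpns, a, b):
    """WWPN zoning: membership follows the device, wherever it is plugged in."""
    return a.wwpn in zone_wwpns and b.wwpn in zone_wwpns


host = Attachment("10:00:00:00:c9:aa:aa:01", switch_port=1)
array = Attachment("50:00:1f:e1:bb:bb:00:01", switch_port=2)

port_zone = {1, 2}
wwpn_zone = {host.wwpn, array.wwpn}

# The host's cable is moved from switch port 1 to switch port 7.
moved_host = Attachment(host.wwpn, switch_port=7)

print(same_port_zone(port_zone, moved_host, array))  # False: port 7 is not in the zone
print(same_wwpn_zone(wwpn_zone, moved_host, array))  # True: the WWPN has not changed
```

Under port zoning the moved host loses its storage (and whatever is later plugged into port 1 gains it); under WWPN zoning the host keeps its access.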
When using zoning and LUN masking techniques, managing network changes becomes critical, due to the need to keep zones and access lists up to date. For example, if you need to replace or add a new HBA into a server, the WWN of that HBA must be properly added to all zones and storage LUN access controls for storage to be seen by that HBA. This adds a significant amount of information which administrators need to keep in sync when components are added, replaced or removed.
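To give a sense of the bookkeeping involved in an HBA swap, here is a rough sketch that treats zones and LUN access lists as plain dictionaries of sets. The data layout, function names and placeholder WWNs are assumptions for illustration; real switches and arrays each keep this information in their own tools.

```python
def replace_wwn(zones, lun_masks, old_wwn, new_wwn):
    """Return updated copies of the zones and LUN access lists with old_wwn swapped for new_wwn."""
    def swap(members):
        return {new_wwn if m == old_wwn else m for m in members}

    new_zones = {name: swap(members) for name, members in zones.items()}
    new_masks = {lun: swap(allowed) for lun, allowed in lun_masks.items()}
    return new_zones, new_masks


def references(zones, lun_masks, wwn):
    """List every zone and LUN access list that names the given WWN."""
    hits = [f"zone:{name}" for name, members in zones.items() if wwn in members]
    hits += [f"lun:{lun}" for lun, allowed in lun_masks.items() if wwn in allowed]
    return sorted(hits)


# Placeholder names stand in for real WWNs.
zones = {"aix_zone": {"wwn_old_hba", "wwn_array_p1"}, "nt_zone": {"wwn_nt_hba", "wwn_array_p2"}}
lun_masks = {0: {"wwn_old_hba"}, 1: {"wwn_nt_hba"}}

print(references(zones, lun_masks, "wwn_old_hba"))  # ['lun:0', 'zone:aix_zone']

zones, lun_masks = replace_wwn(zones, lun_masks, "wwn_old_hba", "wwn_new_hba")
print(references(zones, lun_masks, "wwn_old_hba"))  # []
print(references(zones, lun_masks, "wwn_new_hba"))  # ['lun:0', 'zone:aix_zone']
```

Every place the old WWN appeared has to be updated; missing even one leaves the new HBA unable to see part of its storage.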
Another issue arises when you merge separate fabrics. Because of the complexity of merging separate zoning definitions, the result is sometimes a segmented fabric - a fabric which will not merge because of conflicts between zones. This doesn't often occur, but when it does, administrators must carefully check which definitions are correct, remove erroneous information from zone sets, or ensure no zones are defined on one of the fabrics before the merge.
Even so, zoning and LUN masking can also be powerful tools for managing changes. Implementing them helps prevent inadvertent changes from happening in a network. They can even be used to create a separate zone for new hardware, where equipment is configured, tested and prequalified before that zone is merged with existing configurations and made live.
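A pre-merge check along these lines can flag trouble before two fabrics are joined. This sketch assumes one simple rule, that a conflict is a zone name defined on both fabrics with different members. Actual merge rules are vendor-specific, and the function names and placeholder WWPNs here are hypothetical.

```python
def merge_conflicts(zones_a, zones_b):
    """Names of zones defined on both fabrics with different member sets (assumed conflict rule)."""
    shared = zones_a.keys() & zones_b.keys()
    return sorted(name for name in shared if zones_a[name] != zones_b[name])


def merge_zone_sets(zones_a, zones_b):
    """Combine two zone sets, refusing to merge if any definitions conflict."""
    conflicts = merge_conflicts(zones_a, zones_b)
    if conflicts:
        raise ValueError(f"fabric would segment; conflicting zones: {conflicts}")
    return {**zones_a, **zones_b}


fabric_a = {
    "aix_zone": {"wwpn_aix_hba", "wwpn_array_p1"},
    "nt_zone": {"wwpn_nt_hba", "wwpn_array_p2"},
}
fabric_b = {
    "nt_zone": {"wwpn_nt_hba", "wwpn_array_p1"},  # same name, different members
    "backup_zone": {"wwpn_tape"},
}

print(merge_conflicts(fabric_a, fabric_b))  # ['nt_zone']

# With no zones defined on one fabric, there is nothing to conflict with.
print(merge_zone_sets(fabric_a, {}) == fabric_a)  # True
```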
Mark Antonaccio, a SAN designer working for an East Coast reseller says, "I like to use port-level zoning, combined with a dual redundant fabric. The ability to move from one port to another using World Wide Name [WWN] zoning isn't as important in a resilient design." Antonaccio relies extensively on Brocade's Fabric Manager to handle the storage networks he works on.
Whitlow uses the zoning built into his Compaq switches (OEM'd from Brocade) as well. "I create a zone for my AIX boxes and a separate one for my NT boxes to prevent NT from clobbering drives," he says.
Zoning is limited, however, in its ability to manage at the LUN level. Most switch zones are unable to limit access to individual LUNs on a large storage array, although some vendors are starting to add this capability to their hardware. This limitation means that arrays, even though available in a SAN, still end up inefficiently partitioned - particularly large storage arrays. Zoning also can't separate hosts which need access to the same array port, but not to the same logical unit. This leads to the use of LUN masking at either the storage device or the HBA.
Controlling access at the array
Storage-based LUN masking allows for finer-grained control over storage allocation, down to the LUN level. Using onboard controls on storage arrays, storage-based LUN masking can keep hosts from accessing the wrong storage, guarding against accidental data corruption.
At the storage array, hardware or software checks that the WWPN of the port sending a command is in the access list the administrator has set up for the LUN. When a command is sent to the device, it will be rejected unless the requesting port matches an entry in that list.
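The access-list check can be pictured as a simple lookup. The sketch below is illustrative only: `MaskedArray` and its methods are hypothetical names, the WWPNs are placeholders, and real arrays perform this check in firmware.

```python
class MaskedArray:
    """Toy model of an array that keeps a per-LUN list of permitted WWPNs."""

    def __init__(self):
        self._access = {}  # LUN number -> set of WWPNs allowed to use it

    def grant(self, lun, wwpn):
        """Administrator action: add a WWPN to the access list for a LUN."""
        self._access.setdefault(lun, set()).add(wwpn)

    def accepts(self, initiator_wwpn, lun):
        """True if a command from this WWPN to this LUN would be accepted."""
        return initiator_wwpn in self._access.get(lun, set())


array = MaskedArray()
array.grant(lun=0, wwpn="wwpn_aix_hba")
array.grant(lun=1, wwpn="wwpn_nt_hba")

print(array.accepts("wwpn_aix_hba", 0))  # True: listed for LUN 0
print(array.accepts("wwpn_nt_hba", 0))   # False: not listed for LUN 0
print(array.accepts("wwpn_nt_hba", 5))   # False: LUN 5 has no access list at all
```

Because the check happens at the array, it holds no matter how an individual host is configured.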
Most of the high-end storage arrays offer LUN masking. Whitlow uses storage-based LUN masking on his Compaq StorageWorks arrays to control allocation of storage to different hosts. He says, "We use LUN masking to implicitly say a LUN can see a specific connection, specified by World Wide Name." His ESA10000 and EMA10000 boxes provide control over which hosts can see a specific storage LUN. Other high-end storage arrays - such as EMC and Hitachi - also offer similar control over LUN access.
Storage array-based LUN masking is controlled either from a GUI or from a command line. Generally, LUNs are configured, exported and made available to specific WWNs representing the HBAs in a host. Other hosts which try to access those volumes will be unable to get to the data. LUN masking also allows, in many cases, for presenting different LUN numbers to different hosts - for example, presenting each host's boot volume to that host as LUN 0 to enable booting from the SAN.
The main limitation of storage-based LUN masking is that it's only available on high-end arrays, which typically cost a minimum of several hundred thousand dollars, and isn't available on low-end JBOD systems. In addition, these solutions tend to be difficult to manage - frequently requiring a visit from a local service representative to change settings - which is far from ideal for easy allocation and management of storage.
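Presenting different LUN numbers to different hosts amounts to keeping a per-host mapping from host-visible LUN numbers to internal volumes. Here is a small sketch of that idea; the `Export` record, the volume names and the WWN placeholders are invented for illustration and do not reflect any particular array's configuration format.

```python
from typing import NamedTuple


class Export(NamedTuple):
    """One volume made available to one host WWN under a host-visible LUN number."""
    volume: str    # array-internal volume name (hypothetical)
    host_wwn: str  # WWN of the HBA allowed to see it (placeholder)
    host_lun: int  # LUN number that host will see


def host_view(exports, host_wwn):
    """Map host-visible LUN number -> volume, for the exports granted to one host."""
    return {e.host_lun: e.volume for e in exports if e.host_wwn == host_wwn}


exports = [
    Export("vol_aix_boot", "wwn_aix_hba", 0),
    Export("vol_aix_data", "wwn_aix_hba", 1),
    Export("vol_nt_boot", "wwn_nt_hba", 0),
]

print(host_view(exports, "wwn_aix_hba"))  # {0: 'vol_aix_boot', 1: 'vol_aix_data'}
print(host_view(exports, "wwn_nt_hba"))   # {0: 'vol_nt_boot'}
print(host_view(exports, "wwn_other"))    # {}
```

Each host sees its own boot volume at LUN 0, and a host with no exports sees nothing.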
Access management at the host
HBA-based LUN masking is another option for controlling access to storage in the SAN. By using the HBA's persistent binding or LUN masking features, you can bind a specific LUN to a host and mask off other volumes in the SAN, preventing access to volumes which belong to other hosts. As with switch-based zoning and storage-based LUN masking, this helps in the allocation of storage in the SAN, particularly where JBODs or other storage without storage-based LUN masking are being used. The HBA's driver software works by restricting which volumes are exposed (bound) to the operating system. Only LUNs which have been unmasked and bound to a host are reported to the operating system, so other LUNs can never be claimed or written to by the OS.
Through software or files, specific WWNs and LUN combinations are included in the list of storage to access. These settings usually take effect on reboot, although some cards allow for real-time changing of these LUN masking settings.
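In effect, the HBA driver filters what it discovers against a configured list before the operating system sees anything. The sketch below models that filter; the function name, the `(target WWPN, LUN)` tuple layout and the placeholder WWPNs are assumptions for illustration, not the format of any real HBA driver's configuration.

```python
def visible_to_os(discovered, bindings):
    """Keep only the discovered (target_wwpn, lun) pairs that appear in the host's binding list."""
    return [pair for pair in discovered if pair in bindings]


# Everything this host's HBA can reach on the SAN (placeholder WWPNs).
discovered = [
    ("wwpn_array_p1", 0),
    ("wwpn_array_p1", 1),
    ("wwpn_array_p2", 0),
]

# The WWN and LUN combinations this host is configured to use.
bindings = {("wwpn_array_p1", 1)}

print(visible_to_os(discovered, bindings))  # [('wwpn_array_p1', 1)]
```

Note that the filter runs on the host itself, so it protects the SAN only as long as every host carries a correct list.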
The main limitation of HBA-based LUN masking is that it requires the participation of all of the servers on a SAN. This means that a rogue or incorrectly configured host can easily corrupt data or cause other problems. As the number of hosts attached to the SAN grows, keeping every host's settings correct becomes far more difficult without some central control.
With all of these various techniques required to keep a SAN up and running, you'd think users would jump on any solution to simplify management of the many components in their networks. However, when users learn these techniques, they like them. Quad/Graphics' Whitlow says, "To tell you the truth, the switch's built-in zoning is great, and command line-based control of the storage works fine. Until a tool comes along that is both affordable and performs as well as the built-in tools, we're not looking at other options." He says he's looked at a couple of management solutions to help out with SAN management, but so far, they're too expensive, and "We're not willing to bet our business on it."
In fact, resellers and vendors report that most storage is still bought and sold today based on the features of the hardware. Software and management are usually thought of after the fact, if at all. "We find that users only think about management as an afterthought," says Antonaccio.