This article can also be found in the Premium Editorial Download "Storage magazine: Storage salary survey: Are you being paid enough?"
Remember all of those Y2K patches that we had to apply to the OS in 1999? Remember how we had to visit each server to apply the new patch bundle that the vendor released every quarter?
Remember asking yourself: "How can this process be automated?" Well, managing and booting your OS disks from the SAN lets you create a newly standardized (patched) OS disk that can be discovered, tested and then synchronized with the child node in a mirror, which streamlines system upgrades. And if testing proves that the patch bundle was flawed, you can still fall back to the original OS image with a simple zone change, for as long as your retention policy keeps that image around.
Disaster recovery is yet another business discipline in which SAN-based boot image management can prevail. The ability to drive multiple instances of a boot image to many potential OS disks can drastically improve the recovery time of operating systems in disaster recovery exercises. Following the attacks on Sept. 11, 2001, the brokerage firm that I was working with received a number of brand-new servers on the dock of its recovery site within hours of the disaster (see "
Although a moderate number of installs were proceeding in parallel, the sheer number of systems that had to be regenerated hindered the speed of the deployment. In addition, no one really knew what packages needed to be installed on each server or what answers were being supplied to the installation prompts across the group. The edict was simply to "get them installed!" Because the common thought was "it's better to be safe than sorry," the installation of the entire CD distribution was often the preferred choice, further extending the OS installation portion of the exercise.
In that situation, the ability to quickly generate like-system images, without depending on the number of CD installation media or on the backup server, would have been useful for resurrecting so many servers in such a short period of time. And although you can do that without a SAN by using JumpStart's native IP functionality, the speed and broadcast characteristics typical of a disaster recovery IP solution (i.e., 100Mb/s, routable LAN) would likely be insufficient for the mass rollout of servers. Unless you want to provision a Gigabit Ethernet connection over a flat network space for every application server slated for recovery, you are bound to run into a bottleneck when installing your boot images onto local disks via IP.
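To put rough numbers on that bottleneck, here is a back-of-the-envelope sketch in Python. The server count (50), image size (4GB) and link efficiency (70%) are illustrative assumptions, not figures from the recovery exercise described above, and the function name is hypothetical. It also assumes every install is fed through a single shared link.

```python
def rollout_hours(servers, image_gb, link_mbps, efficiency=0.7):
    """Estimate hours to push one boot image to each server over a shared link.

    Assumes all installs are served through one shared network link, so the
    total transfer time grows linearly with the number of servers.
    """
    total_bits = servers * image_gb * 8 * 1000**3   # decimal gigabytes -> bits
    usable_bps = link_mbps * 1000**2 * efficiency   # usable bits per second
    return total_bits / usable_bps / 3600           # seconds -> hours


# Illustrative assumptions only: 50 servers, one 4GB image each.
fast_ethernet = rollout_hours(50, 4, 100)    # 100Mb/s LAN
gigabit = rollout_hours(50, 4, 1000)         # Gigabit Ethernet
print(f"100Mb/s: {fast_ethernet:.1f} h, 1Gb/s: {gigabit:.1f} h")
```

Under these assumptions the shared 100Mb/s link takes ten times as long as Gigabit Ethernet, and the gap widens in absolute terms with every server added to the recovery list.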
In contrast, with a SAN, once you've done the initial OS install on the JumpStart server and recovered the JumpStart application data with native Unix utilities, like-boot images can be created on independent disks and then served up to the recovered application server's HBA.
Streamlining this process in a disaster recovery effort has many benefits. Not only does it now take just one storage administrator to cook up some boot images, it also frees up precious system administrator time to concentrate on restoring application data once the backup and recovery environment has been certified.
Mirroring root disks across distances is yet another benefit of booting application servers from the SAN. In theory, and considering how little data is written to the OS disk(s) compared with application data disks, any long-distance SAN link capable of sustaining synchronous data flow for a resource-intensive application should also be able to sustain mirror I/O between a local SAN-based boot disk and its remote mirrored partner. As always, extensive testing should be done between the long-distance endpoints before assuming the link will support boot disk traffic.
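Before that testing, you can estimate the propagation delay that distance alone adds to each synchronous mirrored write. The following is a minimal Python sketch, assuming roughly 5 microseconds per kilometer one way in optical fiber. The function name and the round_trips parameter are hypothetical, and real links add switch, protocol and array latency on top of this figure.

```python
FIBER_US_PER_KM = 5.0  # approximate one-way propagation delay in optical fiber


def added_write_latency_ms(distance_km, round_trips=1):
    """Propagation delay added to one synchronous mirrored write, in ms.

    round_trips is an assumption: the number of link round trips a single
    write needs depends on the protocol and any write acceleration in use.
    """
    one_round_trip_us = 2 * distance_km * FIBER_US_PER_KM
    return one_round_trip_us * round_trips / 1000


for km in (10, 100, 1000):
    print(f"{km} km: {added_write_latency_ms(km):.2f} ms added per write")
```

This is only a lower bound on the added latency, so it complements testing of the actual link and does not replace it.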
Additionally, if you are mirroring swap files and your applications make heavy use of them, consider purchasing more system memory before testing the link. If that testing proves successful, further testing should show that, upon disaster declaration and following some massaging in the remote SAN, boot disks and root disk groups can be discovered and imported into a newly deployed server. At that point, the new server is ready for the recovery of its application data. If you already have this infrastructure in place, isn't that reason enough to test?
This was first published in December 2003