While most of the buzz around server virtualization in general, and VMware Infrastructure in particular, has been about server consolidation and greening the data center, disaster recovery may be the IT area where server virtualization technology has the biggest impact.
Disaster recovery (DR) planning for mission-critical applications historically called for replicating the data for these applications and having servers standing by at the DR site ready to take over at a moment's notice.
Most organizations can save money by virtualizing these standby servers. A single offsite host can run the standby domain controller, SQL server, Exchange server and several more as virtual machines. You save not only the cost of all those physical servers, but also the rack space and power charges at your DR site.
Saving money while still providing the same level of protection as your old, expensive physical server solution is a good thing. But the real payoff is improving the recovery time of the applications that you wouldn't dedicate a standby server to. Most organizations soon realize they can move some applications up from the secondary tier and give them standby servers, since the standby servers are essentially free.
Solving bare metal restore to different hardware
In the "old days," secondary applications were limited to restore from tape as their protection model, resulting in multiday recovery points and recovery times. Even if you were replicating the application's data, it wasn't always possible to get an identical server to restore the application backup to. You either had to go down the dank dark path of a bare metal restore to different hardware, or pursue a new OS and application install, all the while hoping you had a record of all the patches needed to mount that database.
The "different hardware" problem is solved because virtual machines are indeed virtual machines -- they all run with the same set of drivers and can't tell if they've been moved from one host to another. In addition, virtual machine snapshots from VMware or even Microsoft's Virtual Server or Hyper-V are just files, so restoring a virtual machine is just a matter of mounting the files on a new host.
Rather than relying on tape transfers, you can schedule snapshots of your virtual machines and transfer them to the DR site over the replication link. And if your network guys can prioritize traffic properly, it won't interfere with real-time replication.
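To judge whether a scheduled snapshot transfer will fit alongside real-time replication, a back-of-the-envelope calculation helps. The sketch below is only an illustration: the function names, the decimal unit conversion and the idea of reserving a fixed fraction of the link for snapshot traffic are assumptions, not anything a particular product provides, and the numbers in the example are made up.

```python
def transfer_hours(snapshot_gb: float, link_mbps: float, share: float) -> float:
    """Estimate hours needed to send a snapshot over part of a link.

    snapshot_gb: snapshot size in gigabytes (decimal, 1 GB = 8,000 megabits)
    link_mbps:   total link speed in megabits per second
    share:       fraction of the link reserved for snapshot traffic (0 < share <= 1)
    """
    if snapshot_gb < 0 or link_mbps <= 0:
        raise ValueError("size must be non-negative and link speed positive")
    if not 0 < share <= 1:
        raise ValueError("share must be in the range (0, 1]")
    megabits = snapshot_gb * 8 * 1000          # gigabytes -> megabits
    seconds = megabits / (link_mbps * share)   # time at the reserved rate
    return seconds / 3600


def fits_window(snapshot_gb: float, link_mbps: float, share: float,
                window_hours: float) -> bool:
    """True if the estimated transfer finishes within the scheduled window."""
    return transfer_hours(snapshot_gb, link_mbps, share) <= window_hours


if __name__ == "__main__":
    # Hypothetical numbers: a 50 GB snapshot, a 100 Mbps link, 25% reserved.
    hours = transfer_hours(50, 100, 0.25)
    print(f"Estimated transfer time: {hours:.1f} hours")
    print("Fits an 8-hour overnight window:", fits_window(50, 100, 0.25, 8))
```

The estimate ignores protocol overhead, compression and deduplication, so treat it as a rough planning figure rather than a guarantee.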
The real fun comes when a disaster is declared and you have to start switching over to the standby servers. Because the suspenders-and-belt crowd set up their DR infrastructure to be able to take over at full speed the minute the switch was thrown, their DR site has lots of compute horsepower. (Of course lots of horsepower means lots of money.)
The more frugal companies take advantage of VMotion, which moves virtual servers from one host to another dynamically while they're still running, together with "shared server" offerings from DR providers like SunGard. With shared servers, you pay a few shekels to the DR provider every month for the right to claim servers out of its stock at the DR site when you declare an emergency. Once you declare one, you get the servers for your exclusive use and can install VMware ESX on them.
Then, once the new hosts are up, you can use VMotion (or even better, VMware DRS) to dynamically allocate virtual servers to hosts based on load and to mount your virtual servers on the new hosts. This will boost your application performance, probably before your users can get to their new workplaces to use the applications.
Note: The same trick -- albeit with longer recovery times -- can be used with vendors' server quick-ship programs, which will ship you new servers in the event of a disaster.
About the author: Howard Marks is chief scientist of Networks Are Our Lives Inc., a Hoboken, N.J., network and storage consulting and education firm. Marks' company specializes in bringing the infrastructures and processes of mid-market firms up to enterprise standards in the areas of systems, network and storage management, with a focus on data protection and business continuity planning. Marks is the author of three books and more than 200 articles on network and storage topics since 1987. He is a frequent speaker at industry conferences.
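To make "allocating virtual servers to hosts based on load" concrete, here is a toy sketch of a greedy placement: take the heaviest virtual machine first and put it on whichever host currently carries the least load. This is not how VMware DRS works internally, and it does not call any VMware API. The function name, the single load number per VM and the host names are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, List


def place_vms(vm_loads: Dict[str, float], hosts: List[str]) -> Dict[str, List[str]]:
    """Greedy placement: assign each VM, heaviest first, to the least-loaded host.

    vm_loads maps a VM name to an abstract load figure (for example, average
    CPU demand). Returns a mapping of host name to the VMs placed on it.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    placement: Dict[str, List[str]] = {host: [] for host in hosts}
    host_load: Dict[str, float] = {host: 0.0 for host in hosts}
    # Sort heaviest first; break ties by name so the result is deterministic.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        # Pick the host with the lowest running total (ties broken by name).
        target = min(hosts, key=lambda host: (host_load[host], host))
        placement[target].append(vm)
        host_load[target] += load
    return placement


if __name__ == "__main__":
    # Hypothetical standby VMs and load figures.
    standby = {"dc": 10.0, "sql": 60.0, "exchange": 40.0, "web": 20.0}
    for host, vms in place_vms(standby, ["esx-a", "esx-b"]).items():
        print(host, vms)
```

As more claimed hosts come online, rerunning the placement with the longer host list spreads the same virtual servers across more hardware, which is the effect the paragraph above describes.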