Virtual machine (VM) backup is more sophisticated now, but finding the right VM backup app for your environment means weighing the importance of many new capabilities.
Data protection in virtual server environments has improved significantly over the past few years. Virtual server backup has evolved from backing up virtual machines (VMs) as if they were physical systems, to backing up OS guests, to backup applications specifically designed to take full advantage of the virtualized state.
There’s a core set of features that anyone managing a virtual server environment should expect from a backup app for virtualized servers. While the list isn’t all-inclusive, these are proven technologies that can make a significant difference in the overall backup process. When evaluating vendors and their products, it’s important to ensure these capabilities and features are included, but it’s equally important to understand how each vendor implements them, as implementations can vary from vendor to vendor.
Changed Block Tracking
Almost every backup application designed to protect virtual servers, especially those running VMware, supports Changed Block Tracking (CBT). With CBT, instead of backing up the entire VM -- or even just its changed files -- each time a backup job is initiated, only changed blocks within the VM are backed up. CBT dramatically reduces the amount of data that needs to be sent across the network to the backup server or backup target.
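To make the savings concrete, here is a minimal Python sketch of the idea behind CBT. It is a toy model, not any vendor's implementation: a real hypervisor records which blocks were written as the writes happen, whereas this example simply compares two in-memory disk images. The names `BLOCK_SIZE`, `split_blocks` and `changed_blocks` are hypothetical, and the four-byte block size is chosen only to keep the example readable.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # bytes per block; unrealistically small, for readability only


def split_blocks(disk: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Split a disk image into fixed-size blocks."""
    return [disk[i:i + block_size] for i in range(0, len(disk), block_size)]


def changed_blocks(previous: bytes, current: bytes) -> Dict[int, bytes]:
    """Return {block_index: new_contents} for every block that differs.

    Assumption: the disk never shrinks between backups. A real hypervisor
    would hand over this list directly instead of comparing images.
    """
    old = split_blocks(previous)
    new = split_blocks(current)
    return {i: blk for i, blk in enumerate(new) if i >= len(old) or old[i] != blk}


previous_disk = b"AAAABBBBCCCCDDDD"   # state at the last backup
current_disk = b"AAAAXXXXCCCCDDDD"    # one block has been rewritten since
delta = changed_blocks(previous_disk, current_disk)
print(delta)  # {1: b'XXXX'}
```

In this example only one of the four blocks crosses the network, rather than the whole image.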
Typically, the process starts with the virtual server backup application performing a snapshot of the backup device so the previous backup state can be preserved. When the CBT backup occurs, the previous backups of the VMs on the backup device are updated with the changed blocks. Once the CBT transfer process is complete, the copy on the backup device represents the most recent backup data with previous backups stored as snapshots for point-in-time recoveries.
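The snapshot-then-update sequence described above can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not how any particular product works: the `BackupTarget` class and its methods are hypothetical, the disk is assumed to keep a fixed number of blocks, and each snapshot is a full copy for simplicity, where a real backup device would typically use space-efficient snapshots.

```python
from typing import Dict, List


class BackupTarget:
    """Toy disk backup target: one current image plus older point-in-time copies."""

    def __init__(self, full_backup: List[bytes]) -> None:
        self.current: List[bytes] = list(full_backup)
        self.snapshots: List[List[bytes]] = []

    def apply_changed_blocks(self, delta: Dict[int, bytes]) -> None:
        # Step 1: snapshot the existing backup so the previous state is preserved.
        # (A full copy here; assumed to be space-efficient on a real device.)
        self.snapshots.append(list(self.current))
        # Step 2: overwrite only the blocks reported as changed.
        for index, block in delta.items():
            self.current[index] = block

    def point_in_time(self, generation: int) -> List[bytes]:
        """Return the backup as it looked before a given CBT run (0 = oldest)."""
        return list(self.snapshots[generation])


target = BackupTarget([b"AAAA", b"BBBB", b"CCCC"])
target.apply_changed_blocks({1: b"XXXX"})  # first incremental run
target.apply_changed_blocks({2: b"YYYY"})  # second incremental run
```

After the two runs, `target.current` holds the most recent backup, while `target.point_in_time(0)` and `target.point_in_time(1)` return the earlier states for point-in-time recovery.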
In the past, one shortcoming of a CBT backup was that to recover a file from the backup instance, the entire instance had to be restored first. In that regard, CBT is similar to image-based backups. However, unlike traditional image backups, modern virtual server backup applications can recover individual components, such as files and email messages, from within the image.
Granular recoveries are typically done by mounting the VM directly from the backup devices and copying out the data, or via a recovery wizard that can peer into the VM and extract individual components. The recovery wizard method is preferred as it involves fewer steps and can be done from a single interface.
Rapid virtual server recovery
The next capability to expect from your virtual server backup app is the ability to recover a server very quickly or directly from the backup device. Some of the speed advantage of virtual server recovery derives from the way it encapsulates an entire server into a file. It’s faster to move a file back into place as a module than it is to reassemble that server from potentially thousands of files.
A few software applications have added the capability to recover only the blocks that need to be updated to satisfy a restore request. For example, if a VM needs to be recovered to how it looked five days ago, the software understands that only a percentage of the blocks within that virtual machine have changed during that time. This means the app knows that only those blocks need to be copied back, replacing newer blocks to make the VM look the way it used to. Essentially, this is CBT in reverse.
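A rough Python sketch of "CBT in reverse" follows. It assumes the production disk and the restore point are available as lists of equal-length blocks and that the disk size hasn't changed in between; the function names `blocks_to_roll_back` and `restore_in_place` are hypothetical and not taken from any product. A real application would consult its change-tracking records rather than compare every block.

```python
from typing import Dict, List


def blocks_to_roll_back(production: List[bytes], restore_point: List[bytes]) -> Dict[int, bytes]:
    """Return {block_index: old_contents} for blocks that changed since the restore point."""
    if len(production) != len(restore_point):
        raise ValueError("this sketch assumes the disk size has not changed")
    return {
        i: old
        for i, (now, old) in enumerate(zip(production, restore_point))
        if now != old
    }


def restore_in_place(production: List[bytes], restore_point: List[bytes]) -> int:
    """Overwrite only the changed blocks; return how many blocks were copied back."""
    delta = blocks_to_roll_back(production, restore_point)
    for index, block in delta.items():
        production[index] = block
    return len(delta)


production_disk = [b"AAAA", b"XXXX", b"CCCC", b"ZZZZ"]  # the VM as it looks today
five_days_ago = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]    # the requested restore point
copied = restore_in_place(production_disk, five_days_ago)
print(copied)  # 2
```

Only two of the four blocks are copied back, yet the production disk ends up identical to the five-day-old restore point.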
Another option, often called “recover in place,” is the ability to start a VM directly from the backup device. In every case we’ve seen so far, this means the backup target must be a disk device. With this capability, previous versions or copies of the current versions of a virtual server can be recovered within seconds. Some applications have the ability to automatically isolate the recovered VMs into a private virtual lab so they don’t conflict with production VMs. This brings a lot of flexibility for testing development code or a disaster recovery (DR) situation.
When comparing the two capabilities, CBT has the advantage of putting the VM back on production data storage very quickly but it may take a few moments to recover that virtual machine. Products with recover-in-place technology will have an advantage in their initial return to operations. However, because those VMs will have to initially execute from the backup device, runtime performance may be affected. Backup devices also usually don’t offer the same availability as production data storage. This means that at some point the VM will have to be moved back into production via VMware Inc.’s Storage vMotion or a similar product.
Another consideration regarding recover in place is that the performance of the backup device becomes a key factor, making its ability to handle random I/O operations critical. While some access to the application is better than no access, if that application is so slow that it can’t be used, then recover in place is far less appealing than a changed block recovery. Ironically, all the features that make a backup device appealing, like high-capacity drives and data deduplication, may now become problematic. In no case is this considered a “deal breaker,” but it’s a scenario that should be tested and planned for.
One of the key goals for any backup application is to facilitate the creation of an off-site copy of data. In legacy backup days, this was a matter of creating an extra copy to another set of tapes. Today, this often involves leveraging CBT to send changed blocks over a WAN connection for reassembly at a DR site. While many virtual server backup products have a replication module, it’s often a separate step in the backup process. Another option is to use a disk backup appliance that provides replication as part of its feature set, keeping in mind the recover-in-place concerns mentioned above.
Must have or nice to have?
There are also a number of backup app features that, in some cases, may be more “nice to have” than required. As always, their importance will largely depend on the data center in which they’re used.
VM-specific vs. legacy backup apps
While not exactly a feature discussion, one of the questions you’ll need to consider when reviewing virtual server backup apps is whether you should opt for one of the new breed of applications designed specifically for virtual environments or if a legacy enterprise backup app will suffice. VM-specific applications became popular because enterprise backup applications were slow in moving to full support of virtual server environments.
Other than the recover-in-place capabilities noted earlier, most legacy backup applications have filled in the gaps and now offer many of the capabilities of VM-specific products. In addition, legacy enterprise backup applications have the ability to support tape, advanced reporting and can back up the non-virtualized part of the environment. Any of those factors may make them more complete options.
Support for other hypervisors
While each vendor makes a big deal out of its recent support of Hyper-V, it’s still not heavily used in most environments. Still, with Windows Server prevalent in most data centers, Hyper-V is likely to be considered for server virtualization, so a backup application’s Hyper-V support is important. Hyper-V products may not be able to leverage the same kinds of hooks and APIs a VMware environment may offer, so backup vendors often have to be more creative. It’s a good idea to plan for extra testing of features such as CBT when they’re supported in Hyper-V environments.
Agents vs. agentless
There has been some debate about the use of agents: Should the backup intelligence reside in the virtual server OS, the host or on a separate piece of hardware? Conventional wisdom holds that the agentless approach is better because no software is required in the guest OS, but this may not actually be the case. If an agent is written correctly, agent-based backup can deliver as good or better performance and potentially better scalability.
Also, even agentless products may still need to deploy an agent to get application-specific information. For example, granular backup of Exchange, SQL Server and SharePoint usually requires an agent.
Two key issues are still weaknesses for most virtual server backup products. The first, as mentioned earlier, is tape support. While the use of tape has diminished in many data centers, it’s often an integral part of backup operations at many companies. With the availability of high-capacity, high-performance media, many companies are actually looking to expand the role of tape in their environment. From a virtual server backup perspective, it’s an ideal way to store older backups for a longer period of time. Being able to selectively push to tape a series of snapshots you want to keep but won’t need instantly would free up capacity on the disk appliance and put older data on a medium designed to sit on a shelf for years.
The second issue is reporting. Advanced reporting needs to be developed to identify which virtual servers haven’t been protected in a given period of time or aren’t protected at all. In addition, effective reporting could provide information about resources consumed during the backup process so better load balancing could be arranged.
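As a sketch of what such a report might compute, the Python below flags VMs that have never been backed up and VMs whose last backup is older than a chosen threshold. Everything in it is an assumption for illustration: the `protection_report` function, the VM names, the dates and the one-day threshold are hypothetical, and a real tool would pull the inventory and backup history from the hypervisor and the backup catalog.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def protection_report(
    last_backup: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta,
) -> Dict[str, List[str]]:
    """Group VMs that were never backed up or whose last backup is too old."""
    never = sorted(vm for vm, ts in last_backup.items() if ts is None)
    stale = sorted(
        vm for vm, ts in last_backup.items()
        if ts is not None and now - ts > max_age
    )
    return {"never_backed_up": never, "stale": stale}


# Hypothetical inventory: VM name -> time of last successful backup (None = never).
now = datetime(2024, 1, 10, 12, 0)
inventory = {
    "web01": datetime(2024, 1, 10, 2, 0),  # backed up ten hours ago
    "db01": datetime(2024, 1, 5, 2, 0),    # last backed up five days ago
    "test07": None,                        # never backed up
}
report = protection_report(inventory, now, max_age=timedelta(days=1))
print(report)  # {'never_backed_up': ['test07'], 'stale': ['db01']}
```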
The state of virtual server backup is improving rapidly and there are many options available from a variety of vendors. The functionality gaps among those vendors are closing rapidly as well. Many of the features described here have become standard fare in most VM backup apps, but your environment -- the way it’s currently configured and its anticipated growth -- will determine which ones are required and which would be nice to have.
BIO: George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.