Published: 01 Aug 2009
Virtual servers have solved a lot of problems in the data center, but they've also made backup a lot harder. There are several ways to back up virtual servers, each with unique advantages and disadvantages.
Backup is the single biggest gotcha for VMware nirvana in large environments today. The usual backup methods cause many environments to limit the number of virtual machines (VMs) they place on a single ESX server, decreasing the overall value proposition of virtualizing servers. To further compound matters, one possible solution to the problem requires purchasing additional physical machines to back up the virtual ones.
However, there are existing products that can solve the problem, if you're willing to move your VMware environment to different storage. If that's not possible, there are some "Band Aid" remedies that can help until storage-independent products arrive. However you ultimately address virtual machine backup, you can at least take some comfort in knowing that you're not alone in your frustration.
The problem is physics
Whenever I consider VMware, I find my mind turning to the movie The Matrix. The millions of VMs running inside VMware are very similar to all of the virtual people living inside the movie's matrix. As with the movie, when you plug into the "matrix" -- VMware, in this case -- you can do all sorts of neat things. In the matrix, you can fly through the air; with VMware, VMs can "fly" from one physical machine to another without so much as a hiccup. In the matrix, you can learn Kung Fu and fly a helicopter in seconds. In VMware, a virtual machine can run on hardware it was never designed for, thanks to the hypervisor.
But when you die in the matrix, you die in real life because your body can't tell the difference between virtual pain and physical pain. Similarly, VMware can't break the connection between virtual worlds and physical worlds. Although those 20 VMs running within a single ESX system may think they're 20 physical servers, there's just one physical server with one I/O system and, typically, one storage system. So when your backup system treats them like 20 physical servers, you find out very quickly that they're running in one physical server.
Usual solution: Denial
Most VMware users simply pretend their virtual machines are physical machines. In various seminar venues, I've polled approximately 5,000 users to see how they're handling VMware backups. Consistently, only a small fraction of those who have virtualized their servers with VMware are also using VMware Consolidated Backup (VCB). The majority simply use their backup software just as they would with physical servers.
There's nothing wrong with doing it that way. A lot of backup administrators suffer from a "VMware backup inferiority complex" because they think they're the only ones doing VM backups that way. They're actually part of a large majority.
If you're doing VM backups that way and they're working, don't worry. The good thing about doing conventional backups is the simplicity of the process. Virtual machine backups work the same as "real" backups; you have access to file-level recovery, database and app agents, and incremental backups (see "Improving old-style virtual machine backups," below).
|Improving old-style virtual machine backups|
You can do several things to improve backups of your virtual servers if you're using traditional software.
Backup from inside ESX server
Another option is to run your backup software at the physical level inside the ESX server. But things get ugly quickly and you'll find yourself doing full backups every day. You're also likely to be doing this without any support from your backup software company, which has little incentive to make this method work. (They'd much rather you use VCB or even the typical agent approach as they get more revenue that way.) The reason you end up doing full backups every day is that any change in the VM modifies the timestamp of its associated VMDK files, and a file-level backup that sees a newer timestamp copies the whole file. So even an "incremental" backup will be the same size as a full backup. This is rarely the best approach to virtual server backup.
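To make the timestamp problem concrete, here is a minimal sketch of how a file-level incremental typically chooses what to copy. It is not taken from any particular backup product; the file names, sizes and the `select_for_incremental` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_bytes: int
    mtime: float  # last-modified time, seconds since the epoch


def select_for_incremental(files, last_backup_time):
    """Pick every file modified since the previous backup (whole files only)."""
    return [f for f in files if f.mtime > last_backup_time]


GB = 1024 ** 3
last_backup = 1_000_000.0  # hypothetical time of the previous backup

# Hypothetical datastore as seen from the ESX server: one large file per VM.
datastore = [
    FileInfo("vm01-flat.vmdk", 40 * GB, last_backup + 3600),   # running VM
    FileInfo("vm02-flat.vmdk", 60 * GB, last_backup + 7200),   # running VM
    FileInfo("vm03-flat.vmdk", 20 * GB, last_backup - 86400),  # powered-off VM
]

selected = select_for_incremental(datastore, last_backup)
incremental_bytes = sum(f.size_bytes for f in selected)
full_bytes = sum(f.size_bytes for f in datastore)

print(f"incremental copies {incremental_bytes // GB} GB of {full_bytes // GB} GB")
```

Only the powered-off VM is skipped. Every running VM has touched its VMDK since the last backup, so the "incremental" copies each of those files in full, no matter how few blocks actually changed inside them.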
VMware Consolidated Backup: Hope or hype?
VMware's answer to the backup dilemma is VMware Consolidated Backup. To use VCB, you must install a physical Windows server next to your ESX server and give it access to the storage that you're using for your VMFS file systems. It can access both block-based (Fibre Channel and iSCSI) and NFS-based data stores. The server then acts as a proxy server to back up the data stores without the I/O having to go through the ESX server.
There are two general ways a backup application interacts with VMware Consolidated Backup. The first method only works for Windows-based VMs. With this method, the backup application tells VMware via the VCB interface that it wants to do a backup. VMware performs a Volume Shadow Copy Service (VSS) snapshot on Windows virtual machines and then performs a VMware-level snapshot that's exported via VCB to the proxy server as a virtual drive letter. (The "C:" drive on the VM becomes the "H:" drive on the proxy server.) Your backup software can then perform standard full and incremental backups of that virtual drive.
The main advantage to this method is the ability to perform incremental backups. The disadvantages are that it's Windows only, there's no official support for applications (including VSS-aware apps) and no ability to recover the VM itself, only the files within the virtual machine.
Alternatively, you can use the full-volume method. VMware performs VSS snapshots as before, but can also perform syncs for non-Windows VMs. However, with this method, the raw volumes the VMDKs represent are physically copied (i.e., staged) from the VMFS storage to storage on the proxy server. Although there's no I/O load on the ESX server itself, this approach places an I/O load on the VMFS storage that's the same as a full backup.
With standard backup products, this staged copy of the raw volume is then "backed up" to tape or disk before it's considered an actual backup. This means that each full backup actually has the I/O load of two full backups. And unless the backup software does a lot of extra work, there are no such things as incremental backups. That means VCB -- with a few exceptions -- creates the I/O load equivalent of two full backups every day.
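The arithmetic is easy to sketch. The snippet below is a back-of-the-envelope model only: the VM sizes are made up, and the `staged_backup_io_gb` helper is a hypothetical name. It assumes the full-volume method reads every raw volume once from the data store to stage it, then reads the staged copy once more to send it to tape or disk.

```python
def staged_backup_io_gb(vm_sizes_gb):
    """Estimate the read I/O (in GB) of one staged full-volume backup."""
    total = sum(vm_sizes_gb)
    datastore_read = total  # staging: every raw volume is copied off the VMFS storage
    proxy_read = total      # backup: the staged copy is read again and sent to tape or disk
    return {
        "datastore_read_gb": datastore_read,
        "proxy_read_gb": proxy_read,
        "total_gb": datastore_read + proxy_read,
    }


vm_sizes_gb = [40, 60, 20]  # hypothetical VMs on one ESX server
nightly = staged_backup_io_gb(vm_sizes_gb)
weekly_total_gb = 7 * nightly["total_gb"]  # no incrementals, so every night is a full

print(nightly)
print(f"weekly read I/O: {weekly_total_gb} GB")
```

With 120 GB of VMs, each night moves 240 GB, and a week of nightly backups moves 1,680 GB. A conventional weekly full plus small daily incrementals would move a fraction of that.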
Symantec Corp. and CommVault have figured out ways to do incremental backups. Symantec uses the full-volume method for the full backup and the file-level method for the incremental backup, and then uses the FlashBackup technology borrowed from Veritas NetBackup to associate the two. Symantec's method significantly reduces the I/O load on the data store by doing the incremental backup this way; however, it requires a multi-step restore of first laying down the full volume and then restoring each incremental backup against that volume. This restore method is cumbersome, to say the least. CommVault's method is to perform a block-level incremental backup against the raw volume, which is a "truer" incremental backup that offers an easier (and possibly faster) restore than Symantec's approach. However, it must be understood that CommVault's method still requires copying the entire volume from the data store to the proxy server. Therefore, their incremental backup places the equivalent of the I/O load of a full backup on the data store every day.
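The tradeoff in a block-level incremental against a raw volume can be illustrated with a small sketch. This is a generic hash-comparison approach, not CommVault's actual implementation; the block size, the sample "volumes" and the function names are assumptions chosen to keep the example readable.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def block_hashes(volume, block_size=BLOCK_SIZE):
    """Hash every block of a volume; used as the baseline after a full backup."""
    return [
        hashlib.sha256(volume[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(volume), block_size)
    ]


def block_incremental(volume, previous_hashes, block_size=BLOCK_SIZE):
    """Return (changed_blocks, new_hashes, bytes_read) for one incremental pass."""
    changed = {}
    new_hashes = []
    bytes_read = 0
    for index, offset in enumerate(range(0, len(volume), block_size)):
        block = volume[offset:offset + block_size]
        bytes_read += len(block)  # every block is read, changed or not
        digest = hashlib.sha256(block).hexdigest()
        new_hashes.append(digest)
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            changed[index] = block  # only changed blocks are kept for the backup
    return changed, new_hashes, bytes_read


monday = b"AAAABBBBCCCCDDDD"   # hypothetical raw volume at the full backup
tuesday = b"AAAAXXXXCCCCDDDD"  # one block has changed since then

baseline = block_hashes(monday)
changed, _, bytes_read = block_incremental(tuesday, baseline)

print(f"blocks stored: {sorted(changed)}, bytes read: {bytes_read} of {len(tuesday)}")
```

Only one block is stored, which is why the restore is simpler than replaying file-level incrementals. But the whole volume still had to be read to find that block, which is the full-backup I/O load on the data store described above.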
Restoring a VM also requires two steps. Your backup software restores the appropriate data to the proxy server and then uses VMware vCenter Converter to restore that to the ESX server. If the backup software supports it, it can do individual file restores by putting an agent on the virtual machine and restoring directly to it; however, restoring the entire VM must be done via the two-step method.
All of these issues contribute to the relatively limited adoption of VCB as a backup solution for VMs. While VMware said VCB has been licensed fairly extensively, my experience indicates that a good number of those license holders have yet to implement it. There's some hope for a better backup process, however, with VMware's vSphere (see "What about vSphere?" below).
|What about vSphere?|
VMware Inc.'s vSphere is the next-gen architecture that's going to add some new features and resolve some backup issues. In particular, it's going to remove the "two copies" problem with VMware Consolidated Backup (which will no longer be called VCB), and it will allow true incremental backups of the VMDK files. VMware vSphere is here and available, but it will be a while (probably six months to a year) before we see these features adopted in backup software.
Some help from point products
There are a few "point products" designed specifically to address VM backup that can be incorporated into the backup process to address some of these issues. Vizioncore Inc. was early to the market with its vRanger Pro product, and has been doing VMware backups longer than anyone else. Another popular alternative is esXpress from PHD Virtual Technologies. Both products are able to do VMDK-level full and incremental backups, and file-level restores with or without VCB. The two products think and behave very differently, however, so make sure you find the best match for your environment. Note that volume-level backups with both products still require reading the entire VMDK file, even if they only write a portion of it in an incremental backup.
You can also use source deduplication backup software, such as Asigra Inc.'s Asigra, EMC Corp.'s Avamar or Symantec's NetBackup PureDisk. One way to use source deduplication backup software is to install it on the VM, where it performs backups like any other agent. Because source deduplication requires fewer CPU cycles and is less I/O-intensive than a regular backup (even an incremental one), it significantly reduces the impact on the ESX server. Doing backups this way also lets you use any database/application agents that the products may offer. The downside is that you're not usually able to do a "bare metal" restore of a VM if this is the only backup you do.
Some products take this approach a bit further by running a backup inside the ESX server itself, capturing the extra blocks necessary to restore the virtual machine. But this method requires the backup app to read all the blocks in all of the VMDK files to figure out which ones have changed. That could significantly impact both I/O and the CPU, as the app calculates and looks up all those hashes.
CDP and near-CDP approaches
Continuous data protection (CDP) and near-CDP backup products are used in much the same way that deduplication software is used. They're installed on your VM and back up virtual machines as they would any other physical server. The CPU and I/O impact of such a backup is very low. Most CDP software won't allow you to recover the entire machine, so you'll need to have an alternative if your VM is damaged or deleted.
So far, all of the methods covered have as many disadvantages as advantages -- if not more. But there's a completely different solution that merits serious consideration: Use a storage system that has VMware-aware near-CDP backup already built into it. (Keep in mind that near-CDP is just a fancy name for snapshots and replication.) Dell EqualLogic, FalconStor Software Inc. and NetApp all have this ability. Other storage vendors are developing similar capabilities, so check with your storage vendor.
The concept is relatively simple. VMDKs are stored on the vendor's storage, and each vendor has a tool designed for VMware that you can run to tell the storage system to back up VMware. VMware then performs a snapshot similar to what it does for VCB, allowing your storage box to then perform its own snapshot of the VMware snapshot. Replicate that snapshot to another box and you have yourself a backup.
The CPU hit on the ESX server is minimal. And the I/O hit on the storage is also minimal, as all it has to do is take a snapshot and then perform a smart, block-level incremental of today's new blocks by replicating them to another system. (Note that this block-level incremental is being done by the storage that already knows which blocks need to be copied, so the I/O impact is as low as it can be.) Vendors that offer these capabilities have their own ways of providing file-level restores from these backups as well.
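Here is a minimal sketch of why the storage-side approach is so cheap. It models a volume that records which blocks are written as the writes happen, so a snapshot can hand over exactly the changed blocks without scanning anything. The `TrackedVolume` class and `replicate` function are hypothetical and do not represent any vendor's implementation.

```python
class TrackedVolume:
    """Toy volume that remembers which blocks changed since the last snapshot."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.dirty = set()

    def write(self, index, data):
        self.blocks[index] = data
        self.dirty.add(index)  # tracking happens at write time, not at backup time

    def snapshot_delta(self):
        """Return the blocks changed since the previous snapshot, then reset tracking."""
        delta = {index: self.blocks[index] for index in sorted(self.dirty)}
        self.dirty.clear()
        return delta


def replicate(delta, replica):
    """Apply a snapshot delta to the replica and return the number of blocks sent."""
    for index, data in delta.items():
        replica[index] = data
    return len(delta)


primary = TrackedVolume(8)
replica = [b""] * 8

# Initial load: every block is new, so the first replication sends everything.
for i in range(8):
    primary.write(i, b"day0")
sent_initial = replicate(primary.snapshot_delta(), replica)

# A normal day: two blocks change, so two blocks are replicated.
primary.write(2, b"day1")
primary.write(5, b"day1")
sent_daily = replicate(primary.snapshot_delta(), replica)

print(f"initial: {sent_initial} blocks, daily: {sent_daily} blocks")
```

Compare this with the hash-scanning methods earlier in the article: here nothing is read to discover what changed, so the daily cost is proportional to the change rate, not to the size of the VMDKs.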
Dell EqualLogic systems, because they're iSCSI, can communicate directly with the virtual machines via IP to coordinate the snapshots. FalconStor has agents that run in all your VMs to coordinate snapshots and do the "right thing" for a number of applications. NetApp uses VMware tools to do snapshots; however, NetApp's truly unique trait is that it can dedupe VMware data -- even live data. Think of all of the redundant blocks of data you can get rid of by using the deduplication tool included with NetApp's Data ONTAP operating system.
Bottom line for VM backup
There are a number of technologies you can deploy today to make VMware backups better. However, many of them are still saddled with disadvantages, especially when compared to traditional backup processes. Perhaps the best current alternative is to move your VMware instances to VMware-aware near-CDP-capable storage. Or maybe VMware will solve some of these backup problems with vSphere.
BIO: W. Curtis Preston is an executive editor in TechTarget's Storage Media Group, as well as an independent backup expert.