
When data storage functions should reside in the VM

Expert Phil Goodwin provides tips on when it's better for storage functionality to be put into VMs, and when it should be left up to the array.

Both VMware and Microsoft are putting traditional storage functions and storage application programming interfaces into their virtual machine kernels. But is it really a good idea to burden the processor with functions that can be offloaded to the storage array? In this technical tip, SearchStorage contributor Phil Goodwin examines when it makes sense to do so, when it does not and how you can tell the difference for your scenario.

In the intense competition among virtualization vendors, a whole new front has opened around storage functionality. With the battle raging fiercely, VMware and Microsoft are continually attempting to one-up each other with new, better and more capable features and functions. All this competition is to the unalloyed good of IT organizations. However, the majority of virtual implementations run in conjunction with networked storage of some variety. Because the capabilities overlap, virtualization architects should carefully consider where functionality belongs before assuming that central management of all utilities is necessarily the best approach.

It’s been decades now, but there was a time when all functionality was concentrated in the host. Input/output (I/O) operations and file systems were entirely managed by the host; disks were dumb, direct-attached devices. As functionality expanded, the CPU, memory and backplane (bus) became the performance bottleneck. Storage vendors responded to this market opportunity by providing more functionality in the storage array to the point where each now has its own operating system and embedded file system. With virtualization engines now handling more and more data movement, are we doomed to repeat history?

The storage functions being offered by virtualization vendors closely resemble those which are available on storage arrays: shared storage (including internal storage, unlike arrays), load balancing, multi-pathing and tiering. Virtual machine (VM) vendors often enable these functions by deploying their own file systems.

Advantages to packing functionality in the VM

There are several major advantages to concentrating functionality in the VM. First, given the dynamic nature of virtual environments, having a tight link between compute elements reduces the chances for unintended side effects of that dynamism. Second, centralized operational management has obvious benefits. Third, organizations can implement inexpensive “white box” or software-defined storage rather than expensive storage arrays. Finally, concentrating functionality in the VM can simplify system upgrades and migrations.

Despite these advantages, storage vendors needn’t be concerned about getting the bum’s rush out the door anytime soon. Data movement demands can be extremely resource-intensive, consuming all available memory and CPU cycles as well as saturating I/O channels. (Note that such operations can also saturate networks and hard disk I/O, but that’s true regardless of where the data movement application resides.) Examples of especially intensive operations are mirroring, replication and backup or recovery. Thus, when one VM runs an intensive operation, it can negatively impact all the VMs on the physical server. Vendors may attempt to avoid problems by limiting the number of simultaneous data movement jobs as well as improving the way features operate, but the consequences are not always foreseeable. The benefits of centralized management can be offset by more complex tuning, more physical servers and more frequent VM movement, which may only compound the issue.
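
To illustrate the idea of limiting simultaneous data movement jobs, here is a minimal Python sketch. It does not represent any vendor's implementation: the JobLimiter class, the run_all helper and the limit of two jobs are hypothetical names and values chosen for illustration. The sketch caps how many jobs run at once on a host and records the peak concurrency it observed.

```python
import threading
import time


class JobLimiter:
    """Caps how many data movement jobs may run at the same time.

    Hypothetical helper for illustration; not a VMware or Microsoft API.
    """

    def __init__(self, max_jobs):
        self._slots = threading.BoundedSemaphore(max_jobs)
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0  # highest number of jobs seen running at once

    def run(self, job):
        # Block here until one of the max_jobs slots is free.
        with self._slots:
            with self._lock:
                self._running += 1
                self.peak = max(self.peak, self._running)
            try:
                return job()
            finally:
                with self._lock:
                    self._running -= 1


def run_all(jobs, max_jobs):
    """Run every job on its own thread, never more than max_jobs at once."""
    limiter = JobLimiter(max_jobs)
    threads = [threading.Thread(target=limiter.run, args=(job,)) for job in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return limiter.peak


if __name__ == "__main__":
    # Six stand-in "replication" jobs; each just sleeps briefly.
    fake_jobs = [lambda: time.sleep(0.01) for _ in range(6)]
    print("peak concurrent jobs:", run_all(fake_jobs, max_jobs=2))
```

A cap like this protects neighboring VMs from one burst of activity, but it also lengthens the total time the jobs take, which is part of the tuning complexity described above.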

Given the pros and cons, here are some guidelines on when to use the VM as a storage server and when to let an array shoulder the load:

Let the VM be the storage server in these situations:

  1. Small deployments, development prototype systems or low-performance applications
  2. Hybrid cloud environments where the cloud storage target is commodity storage
  3. When using “white box” storage or software-defined storage
  4. When VMs and data need to move together, regardless of performance implications

Leave it to the array in these situations:

  1. Enterprise deployments where the storage functionality is there anyway and corporate data management standards govern
  2. High-performance applications
  3. High-density virtual environments where one application can impact a number of others
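
The two lists above can be read as a simple checklist. The following Python sketch encodes them that way; the Workload fields, the signal descriptions and the recommend function are hypothetical names invented for this illustration. The article does not rank the criteria against one another, so when signals from both lists apply the sketch reports "mixed" rather than assuming a precedence.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Yes/no answers for one deployment (hypothetical field names)."""
    small_or_prototype: bool = False
    low_performance: bool = False
    commodity_cloud_target: bool = False
    software_defined_storage: bool = False
    vm_and_data_move_together: bool = False
    enterprise_array_standards: bool = False
    high_performance: bool = False
    high_density: bool = False


# Signals taken from the "let the VM be the storage server" list.
VM_SIGNALS = {
    "small_or_prototype": "small deployment or development prototype",
    "low_performance": "low-performance application",
    "commodity_cloud_target": "hybrid cloud with a commodity storage target",
    "software_defined_storage": "white box or software-defined storage",
    "vm_and_data_move_together": "VMs and data must move together",
}

# Signals taken from the "leave it to the array" list.
ARRAY_SIGNALS = {
    "enterprise_array_standards": "enterprise deployment with corporate standards",
    "high_performance": "high-performance application",
    "high_density": "high-density virtual environment",
}


def recommend(workload):
    """Return (verdict, vm_reasons, array_reasons) for a workload."""
    vm = [why for name, why in VM_SIGNALS.items() if getattr(workload, name)]
    array = [why for name, why in ARRAY_SIGNALS.items() if getattr(workload, name)]
    if vm and array:
        verdict = "mixed"  # both lists apply; needs an architect's judgment
    elif vm:
        verdict = "vm"
    elif array:
        verdict = "array"
    else:
        verdict = "undecided"
    return verdict, vm, array


if __name__ == "__main__":
    print(recommend(Workload(small_or_prototype=True)))
    print(recommend(Workload(high_performance=True, high_density=True)))
```

Returning the matched reasons alongside the verdict keeps the checklist auditable, which matters most in the "mixed" case, where the guidelines alone do not settle the question.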

The best of both worlds is evolving, where storage vendors are leveraging VM application programming interfaces to tightly integrate array functionality into the VM while leaving the array to offload the work. Market opportunities do not remain void for long.

Do you think storage functionality should reside in the VM kernel or be offloaded to the array?
Depends on what you want to do.
I don't want to end up with the equivalent of a mainframe - there is a reason we have both types of compute environments
I agree with you. Virtualization consolidates applications into VMs, but the containers for storage (block devices such as LUNs, or file systems) have not changed along with that. Storage needs to be managed at the VM level, and new storage technologies are doing just that: VM-aware storage. Tintri (who I work for) and VMware are leading the charge on this. Controlling storage at the VM level (IOPS, replication, snapshots, cloning and so on) is the way we should be managing.