Craft an effective data migration strategy for VMs

To migrate data successfully in a virtual environment, organizations need to address factors such as data structure and resource contention.

When it comes time to virtualize an organization's servers, it's important to plan for any required data migrations.

The first step in any data migration strategy is to determine whether data actually needs to be migrated. Organizations have many different reasons for performing physical-to-virtual (P2V) conversions. Depending on those reasons, it may be perfectly acceptable to virtualize the application servers but not the back-end database servers. If an organization is trying to virtualize as many systems as possible so it can move everything to a standard platform, then a data migration will probably be required.

Once you have decided which data needs to be migrated, address the following two questions:

Is the data structured or unstructured?

The type of data involved can make a big difference to the logistics of the data migration strategy. It is usually easier to migrate unstructured data (file data) because it can be migrated gradually using file system replication or backup and restore.
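As a rough illustration of what "gradually" can mean for file data, the sketch below copies only files that are missing or changed on the destination, so it can be rerun until a pass finds nothing left to copy. It is a minimal example, not a replacement for a file system replication product: the function name `sync_pass` and the size-plus-timestamp comparison are assumptions made for this illustration, and it ignores deletions, permissions and open files.

```python
import shutil
from pathlib import Path


def sync_pass(src: Path, dst: Path) -> list[Path]:
    """Run one incremental pass; return the relative paths that were copied.

    Hypothetical helper: a file is copied when it is missing on the
    destination, differs in size, or is newer on the source.
    """
    copied = []
    for src_file in sorted(src.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue  # destination copy is already current
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves the timestamp
        copied.append(rel)
    return copied
```

Running the pass repeatedly while users keep working shrinks the remaining delta, so the final cutover only has to move whatever changed since the last pass.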

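Structured data (database data) is trickier to migrate because it is typically in constant use. The database keeps accepting writes while a copy is in progress, so a copy made in stages is unlikely to be consistent. Given the nature of structured data, a gradual migration probably won't be an option.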

In most cases, mission-critical databases are configured as high-availability clusters. In such situations, it is often possible to virtualize individual database cluster nodes, building a guest cluster in the process. However, there are two major considerations with doing so:

  • Node placement. If your hypervisor supports it, you will need to set up anti-affinity rules to prevent guest cluster nodes from residing on a common physical host. Otherwise, a host-level failure could cause your database cluster to fail. While the database cluster nodes should fail over to a different hypervisor cluster node in the event of a host failure, you can greatly improve the odds of the database remaining online during such a failure by scattering guest cluster nodes across multiple nodes within your hypervisor cluster.
  • The data itself. The nodes within a failover database cluster rarely store data themselves. The cluster nodes are usually tied to a Cluster Shared Volume, and you must decide what to do with the data. Depending on where the data is located, you may be able to leave it in its original location, but you must evaluate any hypervisor-specific limitations. For example, you will likely experience backup problems in a Hyper-V environment if you try to connect the guest cluster nodes to a storage repository by treating the storage as a SCSI pass-through disk. Instead, storage mappings should be handled from inside the guest operating system.
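The anti-affinity requirement can be expressed as a simple check: no two nodes of the same guest cluster should share a physical host. The sketch below is hypervisor-agnostic and purely illustrative. The function name, the VM and host names, and the idea of feeding it a VM-to-host mapping are assumptions, not any vendor's API. In practice you would configure the rule in the hypervisor itself and use a check like this only to audit the current placement.

```python
from collections import defaultdict


def find_anti_affinity_violations(
    placement: dict[str, str], guest_cluster: set[str]
) -> dict[str, list[str]]:
    """Return hosts that run more than one node of the guest cluster.

    placement maps VM name -> physical host name (hypothetical inventory data).
    """
    nodes_by_host = defaultdict(list)
    for vm, host in placement.items():
        if vm in guest_cluster:
            nodes_by_host[host].append(vm)
    # A host violates the rule when it carries two or more cluster nodes.
    return {
        host: sorted(vms) for host, vms in nodes_by_host.items() if len(vms) > 1
    }


# Example inventory: two database nodes have landed on the same host.
placement = {
    "sql-node1": "host-a",
    "sql-node2": "host-a",
    "sql-node3": "host-b",
    "web1": "host-a",
}
violations = find_anti_affinity_violations(
    placement, {"sql-node1", "sql-node2", "sql-node3"}
)
```

Here `violations` flags `host-a`, because a failure of that host would take out two of the three database nodes at once.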

How will the data migration strategy affect resource contention?

Resource contention can be thought of as competition among virtual machines (VMs) for hardware resources. Contention is inherent in server virtualization, which is built on resource sharing. The idea is that server hardware has become sufficiently powerful to run multiple concurrent workloads. However, the available hardware resources (CPU, memory and storage) must be shared among the VMs that make use of the hardware.

Database servers tend to consume more hardware resources than other types of VMs. This is especially true when it comes to storage IOPS and storage capacity, so you must plan for this.
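One way to plan for this is to total the expected demand of the VMs that will share a volume and compare it with what the volume can deliver. The sketch below does only that arithmetic. The `VM` class, its field names and every number in the example are hypothetical; real figures would come from monitoring the existing physical servers and from the storage vendor's specifications.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    peak_iops: int    # assumed peak demand, taken from monitoring
    capacity_gb: int  # assumed provisioned capacity


def volume_headroom(
    vms: list[VM], iops_limit: int, capacity_gb_limit: int
) -> tuple[int, int]:
    """Return (IOPS headroom, capacity headroom in GB) for one volume.

    A negative value means the volume would be oversubscribed.
    """
    used_iops = sum(vm.peak_iops for vm in vms)
    used_gb = sum(vm.capacity_gb for vm in vms)
    return iops_limit - used_iops, capacity_gb_limit - used_gb


# Illustrative numbers only: one database node and one small web server.
vms = [VM("sql-node1", 4000, 500), VM("web1", 300, 40)]
headroom = volume_headroom(vms, iops_limit=5000, capacity_gb_limit=1000)
```

With these made-up figures the database node alone uses most of the IOPS budget, which is the kind of imbalance worth spotting before the migration rather than after it.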

One mistake that is often made in virtual server environments is placing all the VMs and their data on a single Cluster Shared Volume. Hypervisor vendors strongly recommend the use of a Cluster Shared Volume, but there is no rule that states you must limit your virtualization infrastructure to a single Cluster Shared Volume. You can usually improve performance by using multiple Cluster Shared Volumes.

In the case of a virtualized database application, you might consider distributing your guest cluster nodes across as many Cluster Shared Volumes as you have available. That way, a Cluster Shared Volume failure (physical or logical) will be less likely to take down the entire database cluster than if all the guest cluster nodes shared a common Cluster Shared Volume.
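Distributing nodes across volumes can be as simple as a round-robin assignment. The sketch below assumes the volumes are interchangeable and uses hypothetical node and volume names. A real placement would also weigh each volume's free capacity and I/O headroom.

```python
def spread_nodes(nodes: list[str], volumes: list[str]) -> dict[str, str]:
    """Assign each guest cluster node to a volume in round-robin order."""
    if not volumes:
        raise ValueError("at least one volume is required")
    return {node: volumes[i % len(volumes)] for i, node in enumerate(nodes)}


# Three database nodes spread over two volumes (names are illustrative).
layout = spread_nodes(["sql-node1", "sql-node2", "sql-node3"], ["csv1", "csv2"])
```

With fewer volumes than nodes, some nodes inevitably share a volume. The assignment still guarantees that the loss of any single volume leaves at least one node's virtual disks untouched, provided there are at least two volumes.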

The data itself should be placed on a completely separate physical storage device configured to act as a Cluster Shared Volume for the guest cluster, but not the host cluster. That way, you don't have to worry about unrelated VMs generating I/O requests that interfere with database performance. Similarly, keeping the data on a dedicated storage device eliminates the possibility that a hypervisor-level Cluster Shared Volume failure will impact the data.

Ultimately, you will have to base your data migration strategy around your hypervisor and the hardware in your datacenter. Whether you use VMware, Hyper-V or something else, each hypervisor has its own nuances that will impact data migration.
