
Data migration products: Proceed with caution

Host-based data migration products are the best choice for moving mission-critical data. While there are only a few products in this category, they vary significantly.

The days are numbered for the old ways of migrating data. Array-based options and native operating system utilities are no longer sufficient for migrating data for always-on mission-critical applications. As the amount of data that needs to move increases, so do the time and complexity associated with planning and executing the migration. Host-level data migration products automate, centralize and simplify data migrations, while ensuring that the moved data keeps its integrity and that the application remains live. (This is the first of a three-part series on data migration products. The second article will focus on file system-based data migration products.)

Host-level data migration products allow users to:

  • Migrate data between internal and external storage disk devices
  • Migrate data between different brands and types of storage arrays
  • Allow applications to operate continuously during migration

Although there are only three block-based, host-level data migration products, they offer a wide choice in how to migrate data. Softek Storage Solutions Corp.'s Transparent Data Migration Facility (TDMF) migrates data between disk storage devices local to the server; Symantec Corp.'s Veritas Volume Replicator migrates data between different servers using TCP/IP; and Topio Inc.'s Topio Data Protection Suite (TDPS) performs both types of migration.

These products are used to complete the block-level movement of all data from one locally attached storage device to another, or from one server to a remote server. But don't expect them to totally eliminate the need for the application outages or server reboots that data migrations typically require. The products differ in how they're installed and configured, how they move data to the target volume, how they fall back to the source volume in the event of problems, and how they manage data flow and integrity during the migration. All of these are factors to consider.

Softek recently conducted a data migration survey of 700 users and found that the most frequent reasons for data migrations were to replace storage or server equipment (38%), storage and server consolidations (17%) and the relocation of data (10%). The study also revealed that 39% of these users perform migrations on a weekly or monthly basis. To accomplish these types of data migrations with few or no application outages and minimal administrative intervention, block-based data migration tools support the following features to varying degrees:

  • Support for different protocols to allow for local and distance data migrations
  • Ability to throttle or control data flows between servers
  • Integrity checking to verify the consistency of the data during migration
  • Ability to fail back to the original storage device
  • A central interface to manage data migrations on different host systems

Installation and configuration

Users who opt for either Symantec's Veritas Storage Foundation or Veritas Volume Replicator to migrate data require an installed version of Symantec's VolumeManager (part of Veritas Storage Foundation). Softek's TDMF and TDMF-IP, and Topio's TDPS install a little differently. All of them require server reboots on Windows systems after installation because, like antivirus programs, they install as filter drivers. However, unlike antivirus filter drivers that reside above the file system, these products sit below the file system and above the Windows volume manager to capture and copy write I/Os.

On Unix operating systems, these three products don't require server reboots because they don't make modifications to the Unix kernel. These products take advantage of the better support Unix kernels provide for dynamically loaded drivers. For instance, on HP-UX systems, they use the modload command that's part of HP-UX's Dynamically Loadable Kernel Module (DLKM) to install their drivers without requiring a server reboot.

Controlling the migration process

During data migrations, it's important to control the rate at which data is moved and to be able to fail back to the source disks if there's an issue with the migration. Products should monitor data throughput and, through the use of preset policies, dynamically increase throughput during periods of low application activity and throttle back data flow during periods of peak application performance. The ability and speed at which users can fall back to the source disk is important if the target volume doesn't perform or isn't configured as anticipated.
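A policy of this kind can be sketched in a few lines of Python. This is an illustration only, not any vendor's implementation: the function name, the IOPS thresholds and the rate limits are hypothetical values chosen for the example.

```python
def next_copy_rate(current_mb_s, app_iops, low_iops, peak_iops,
                   min_mb_s=5.0, max_mb_s=200.0, step=2.0):
    """Pick the migration copy rate (MB/s) for the next interval.

    All thresholds and limits are hypothetical policy settings.
    """
    if app_iops >= peak_iops:
        # Application is at peak load: throttle back, but never below the floor.
        return max(min_mb_s, current_mb_s / step)
    if app_iops <= low_iops:
        # Application is quiet: speed up, but never above the ceiling.
        return min(max_mb_s, current_mb_s * step)
    # Normal load: leave the copy rate unchanged.
    return current_mb_s


# Example: a 40 MB/s migration reacting to three load levels.
for iops in (500, 4000, 9000):
    print(iops, next_copy_rate(40.0, iops, low_iops=1000, peak_iops=8000))
```

A real product would measure throughput continuously and apply the policy at each interval. The sketch shows only the decision itself.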

At a block level, Symantec's VolumeManager offers two choices to migrate data: volume mirrors and volume evacuations. With the first option, users establish a mirror between the source and target volumes. Once the mirror to the target volume is complete, users can break the mirror and allow the target volume to become the primary volume for the application.

Users need to account for a few factors when using VolumeManager. First, the write is duplicated at the host bus adapter (HBA) level, so the writes are sent directly from the HBA to the storage targets to eliminate CPU consumption. However, when mirroring over Fibre Channel distances of more than 80 km, the application must wait for a write confirmation from both the source and target volumes before processing the next I/O, which can slow the application. Second, if users need to fail back to the original volume once the mirror is broken, there could be a significant delay while VolumeManager syncs up the blocks on the source and target volumes.

Turning on VolumeManager's optional Dirty Region Log (DRL) feature allows users to more quickly recover back to the original state of the mirrored volumes after a mirror is broken. The DRL keeps track of blocks that have changed since the mirror was broken and is used by VolumeManager to recover only the blocks of the volume that require recovery if it's necessary to fail back to the original volume.
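The idea of tracking changed regions so that only those regions need to be resynchronized can be illustrated with a small Python sketch. It models the behavior described above, not VolumeManager's actual on-disk log. The class name, the region size and the use of byte arrays as stand-ins for volumes are all assumptions made for the example.

```python
class DirtyRegionLog:
    """Toy change-tracking log; volumes are modeled as equal-length bytearrays."""

    def __init__(self, region_size):
        self.region_size = region_size
        self.dirty = set()  # indexes of regions written since the mirror was broken

    def record_write(self, offset, length):
        # Mark every region touched by a write of `length` bytes at `offset`.
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        self.dirty.update(range(first, last + 1))

    def resync(self, newer, older):
        # Copy only the dirty regions from the newer volume to the stale one.
        for region in sorted(self.dirty):
            start = region * self.region_size
            end = start + self.region_size
            older[start:end] = newer[start:end]
        copied = len(self.dirty)
        self.dirty.clear()
        return copied


# Example: one small write after the split means one region to recover.
source = bytearray(b"a" * 16)
target = bytearray(source)
log = DirtyRegionLog(region_size=4)
target[5:7] = b"ZZ"          # application writes to the new primary
log.record_write(5, 2)
print(log.resync(newer=target, older=source), source == target)
```

Without the log, a fail back would have to compare or copy every block on the volume, which is the source of the delay described earlier.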

An alternative method VolumeManager provides allows users to evacuate specific volumes. Users can either manually select which source volume they want to migrate to the target volume or set up VolumeManager to automate the process. When this occurs, all of the data is moved from the source volume to the target volume so that when the migration is complete, no data remains on the source volume.

Obviously, manually selecting which volumes to migrate can be resource and time intensive, but it allows users to selectively migrate volumes when they're not being used by the application. Users may also automate this process with policies provided by Symantec, but volumes may be migrated when they're experiencing a heavy application load, affecting the application's ability to function. Employing this approach also makes it difficult to fail back to the source volume since all data, as opposed to individual volumes, must be migrated back to the source volume.

Softek's TDMF uses a different approach to migrate data between local storage volumes. Prior to copying a block, it places a lock on the block on the source volume, copies that block to the target volume and then releases the block. By default, copies occur every 1/100th of a second in 256KB blocks. However, Softek engineers recommend increasing the block size to 4MB or 8MB and executing the migration during periods of application inactivity to complete the migration more quickly. Once the blocks are migrated, any updates to the migrated blocks are synchronously written to both the source and target volumes.
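The lock, copy and release sequence, together with the synchronous mirroring of later updates, can be modeled roughly as follows. This Python sketch is not TDMF code. The class and function names are hypothetical, volumes are plain lists, and the timing helper assumes exactly one block is copied per interval and treats KB as 1,024 bytes.

```python
import threading


class LocalMigration:
    """Toy model of a lock/copy/release migration between two volumes."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.locks = [threading.Lock() for _ in source]  # one lock per block
        self.migrated = [False] * len(source)

    def copy_block(self, i):
        # Lock the source block, copy it to the target, then release it.
        with self.locks[i]:
            self.target[i] = self.source[i]
            self.migrated[i] = True

    def write(self, i, data):
        # Application write: blocks already migrated are written to both volumes.
        with self.locks[i]:
            self.source[i] = data
            if self.migrated[i]:
                self.target[i] = data


def hours_to_copy(volume_gib, block_kib, copies_per_second=100):
    # Best-case copy time if one block moves every 1/copies_per_second seconds.
    total_kib = volume_gib * 1024 * 1024
    return total_kib / (block_kib * copies_per_second) / 3600


print(round(hours_to_copy(1024, 256), 2))   # 1 TiB volume at 256KB blocks
print(round(hours_to_copy(1024, 4096), 2))  # same volume at 4MB blocks
```

Under those assumptions, the default setting caps the copy at about 25MB per second, so a 1 TiB volume takes more than 11 hours, while 4MB blocks bring the best case under an hour. That gap is one way to read the recommendation to raise the block size.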

Once the migration is complete and the volumes are synchronized, TDMF provides Unix users with a command-line option to redirect the application I/O from the source volume to the target volume, or back again if there are problems. Completed without quiescing the application, the switchover command elevates the target volume to the position of primary volume, while in the background TDMF mirrors write I/Os to the old source volume. This allows users to fall back to the original configuration using the same switchover command, provides a period of time to test the new target volume with the application, and lets users choose the exact time to terminate the copies to the source volume.

Topio's TDPS also starts by copying all the data from the source volume to the target volume, although it performs a few tasks differently. First, it copies the blocks in average sizes of about 10KB. Then, instead of synchronously copying writes to both source and target volumes, it makes a copy of the write I/Os and puts them in a buffer. Finally, in the instances where a write is occurring to a block at the same time it's being copied, TDPS suspends the copy, allows the update to complete and then copies the updated block to the target volume.

Because TDPS uses copies of the write I/Os instead of write mirroring, it's necessary to quiesce the application to switch from the source to the target volume. TDPS' "freeze" feature quiesces the application, verifies that all writes have been written to the source volumes, copies the final write I/Os to the target volume and then verifies that both volumes are in sync. Once this is complete, the application can be safely repointed to the target volume.
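The difference between buffering copies of writes and mirroring them, and why a final quiesce is needed, can be seen in a small model. The Python below is a sketch under simplifying assumptions, not Topio's implementation: the class and method names are hypothetical, volumes are lists, and the buffer is an in-memory queue.

```python
from collections import deque


class BufferedMigration:
    """Toy model: writes go to the source and a copy is queued for the target."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.pending = deque()  # copies of write I/Os not yet applied to the target
        self.frozen = False

    def write(self, i, data):
        if self.frozen:
            raise RuntimeError("application is quiesced")
        self.source[i] = data
        self.pending.append((i, data))  # buffered, not mirrored synchronously

    def bulk_copy(self):
        # Initial pass: copy every block from the source to the target.
        for i, data in enumerate(self.source):
            self.target[i] = data

    def drain(self):
        # Apply buffered writes to the target in the order they occurred.
        while self.pending:
            i, data = self.pending.popleft()
            self.target[i] = data

    def freeze_and_verify(self):
        # Quiesce, flush the remaining buffered writes, then compare the volumes.
        self.frozen = True
        self.drain()
        return self.source == self.target


m = BufferedMigration([1, 2, 3], [0, 0, 0])
m.bulk_copy()
m.write(2, 5)                  # target lags until the buffer is drained
print(m.target, m.freeze_and_verify(), m.target)
```

Until the freeze step runs, the target can trail the source by whatever is still in the buffer, which is why the application cannot simply be repointed while it is still writing.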

While quiescing the application temporarily disrupts the migration, it provides a checkpoint to test the data on the target volume. By first verifying the integrity of the data on the target volume and how well it performs using another instance of the application, users can achieve some level of confidence before switching over the production application to the target volume. However, this testing requires users to resync the data on the source and target volumes before actually cutting over the production server to the new volume.

What are your migration objectives?

Host-level, block-based data migration software products vary in functionality and capabilities. Knowing which product to choose requires a solid definition of the objectives you're trying to achieve. Here are some questions to ask to select the best product for your environment:

  • How much downtime is acceptable? Softek Storage Solutions Corp.'s Transparent Data Migration Facility (TDMF) requires no application downtime or quiescence when doing local volume migrations on Unix operating systems. Symantec Corp.'s VolumeManager and Veritas Volume Replicator products may allow users to avoid downtime if VolumeManager is already installed on the server and the synchronous mirror options are turned on. All other block-based migrations require some application downtime, if only for a few seconds.

  • How old is the operating system? One of the problems with using host-level data migration products is their lack of support for older operating systems.

  • Do you need or want the option to recover quickly from the source disk? Not all products offer the same ability to fall back to the source volume.

  • Is a central console needed to manage data migrations across multiple servers? Users who deploy Topio Inc.'s Topio Data Protection Suite (TDPS) will need to ensure that they have a Windows host available to manage migrations on Unix servers.

  • Do you need to control the data migration from a local server? Only Softek and Symantec provide a command-line interface and allow users to control and manage the data migrations from the host on which the agent is running.
