Moving data is a constant struggle for most storage departments. Sluggish applications need to move to faster disks for better performance. Rarely accessed data needs to move in the other direction -- to less-expensive ATA disks, CD-ROM or tape. If not handled judiciously, data migrations can cause application outages and server reboots, resulting in 2:00 a.m. work sessions.
There are many methods at different price points to move data. Some organizations may be able to simply use the move command that comes with every operating system. Others may need standalone utilities and network- and host-based approaches to get data from point A to point B. What's the best way to move data? To determine which method suits your needs, consider the following:
Once you document your storage environment against this list (see "Data migration checklist"), pick a migration tool. Migration utilities operate at the host, network and array level. Each approach comes with its own set of advantages and disadvantages. For example, host-based commands like move should only be used for files not in use. Third-party, host-based utilities like NSI Software's Double-Take can help users measure and forecast the impact, time and I/O wait times of a data migration before moving the file. Without interrupting the application, EMC Corp.'s Symmetrix Optimizer utility automates load balancing and data placement within the array to improve application performance.
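For the simplest host-based case, moving a file that is not in use, the rough sketch below copies the file, checks that the copy matches and only then deletes the original. It is an illustration under stated assumptions rather than a description of how any product named here works: the function names sha256_of and verified_move are hypothetical, and the script assumes nothing else is writing to the file while it runs.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verified_move(src, dst):
    """Copy src to dst, confirm the checksums match, then delete src.

    Hypothetical helper: assumes the file is not in use during the move.
    """
    expected = sha256_of(src)
    shutil.copy2(src, dst)  # copies contents plus timestamps and permissions
    if sha256_of(dst) != expected:
        os.remove(dst)  # discard the bad copy and leave the source untouched
        raise IOError(f"checksum mismatch copying {src} to {dst}")
    os.remove(src)  # the source is removed only after the copy is verified
    return expected
```

Deleting the source last means a failed or corrupted copy leaves the original data in place, at the cost of temporarily needing space for both copies.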
While some applications can tolerate outages, it's often simply not practical to shut down an application to perform the migration. The utility you choose should be able to monitor the application and speed up, slow down or even stop the data migration if the application becomes exceptionally busy. This allows server resources such as CPU and memory to be diverted from the data migration utility to the application.
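The throttling behavior described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the names adjust_rate and throttled_copy, the load thresholds and the rate limits are all assumptions, and get_load is a hypothetical callback that returns application load as a fraction between 0.0 and 1.0.

```python
import time

CHUNK = 1024 * 1024  # copy in 1 MiB chunks


def adjust_rate(rate, load, busy=0.75, pause=0.95, min_rate=1.0, max_rate=64.0):
    """Return a new transfer rate in MiB/s for the observed load.

    Thresholds and limits are illustrative assumptions. 0.0 means "paused".
    """
    if load >= pause:
        return 0.0  # application is exceptionally busy: stop copying
    if load >= busy:
        return max(min_rate, rate / 2)  # back off, but keep trickling data
    return min(max_rate, max(rate, min_rate) * 2)  # headroom: speed up


def throttled_copy(src, dst, get_load, rate=8.0, sleep=time.sleep):
    """Copy src to dst (binary file objects), pacing itself by get_load()."""
    copied = 0
    while True:
        rate = adjust_rate(rate, get_load())
        if rate == 0.0:
            sleep(1.0)  # paused: wait, then check the load again
            continue
        chunk = src.read(CHUNK)
        if not chunk:
            return copied  # end of source reached
        dst.write(chunk)
        copied += len(chunk)
        # Sleep long enough that this chunk averages out to the target rate.
        sleep(len(chunk) / (rate * 1024 * 1024))
```

On a Unix-like host, one plausible load probe is lambda: os.getloadavg()[0] / os.cpu_count(), although a real utility would more likely watch application-level measures such as I/O wait or response time.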
A specialized migration tool isn't necessarily more effective at moving data. Host-based software such as Veritas Software Corp.'s Volume Replicator is often used to avoid vendor lock-in. However, host-based software can be costly and difficult to manage, depending on the number of servers, operating systems and arrays involved.
Array-based utilities such as EMC's Symmetrix Remote Data Facility (SRDF) and IBM's Peer-to-Peer Remote Copy (PPRC) tie users to a specific vendor's hardware, but can be administered by a smaller, well-trained staff with minimal intervention required at the server level during the migration process.
About the author: Jerome Wendt is an independent writer specializing in the field of open systems storage and storage area networks. He has managed storage for small and large organizations in this capacity.
This article first appeared in Storage magazine.
This was first published in May 2004.