Published: 09 Jun 2008
Moving data from one array to another, or from one tier of storage to another, is a tedious process that's slowly becoming more automated.
In a perfect world, you would use automated tools to migrate data. Yet data migration is too often done manually: an administrator takes the system offline, backs it up to tape, installs the new array and recovers the data to its new location. A complicated data migration may include 50 or more steps, and take a night or even a weekend to perform. For businesses operating 24 hours a day, 365 days a year, this is simply too long to have a system down.
"Downtime is costly; it costs me $30,000 an hour. That's not really that large an amount, but not having to take the NetApp filer out of service and plan downtime in off hours is beneficial," says Stephen R. O'Neill, VP of technology at Oversee Domain Services, a division of Oversee.net, in Los Angeles. "My engineers don't have to be up in the middle of the night and do all the things you do to mitigate the impact of maintenance."
You would think that migrating data, which is such a routine and tedious operation, would be easier.
According to a survey conducted by the New York City-based research firm TheInfoPro Inc., problems arising from migration include users suffering from extended or unexpected downtime, technical compatibility issues, data corruption, application performance issues, and missed data or data loss.
The majority of data migrations occur when storage equipment comes off lease and is replaced with a new system. Problems migrating data from one array to another or from one filer to the next are compounded by the heterogeneity of the devices and, often, a lack of software to move the data in an organized, automated fashion.
Consider this scenario: Peter Fitch is the infrastructure manager at Rudolph Technologies Inc., a semiconductor services company in Bloomington, MN. He migrated his firm's data from Dell Inc. PowerVault arrays to a Compellent Technologies Inc. Storage Center SAN two years ago. "Back then, we used the old tape backup method," he explains. "We would have had to back up to tape, create a new volume on the tier of storage we wanted to use, restore that data and recreate all the shares as well. Time constraints were a concern; we would have had to do the migration over a weekend if it was a larger, substantial-sized volume of 600GB or 1TB, and we would have had to do that in the off hours and ruin the IT staff's weekend doing so."
Fitch's migration jobs are a lot easier now with Compellent's Data Progression Module (DPM), which moves data according to policies. "We just let it go," he says. "Nobody in the company even knows [the migration] is happening." Fitch says 99.95% of the data migrated by the DPM ends up on Tier 3 storage, leaving Tier 1 for important applications that need to be fresh, such as boot from SAN or live databases.
With Compellent's Thin Import capability, Fitch can move data without writing scripts and without the need for backup software or hardware. "You could use a version of Robocopy and script the move, or a trial version of tape backup software," he says. "We just plug the NetApp filer into the Fibre Channel [FC] connection on the Compellent SAN--which showed up as an external device--and just move the data over." (Robocopy is a command-line directory replication tool that's a standard part of Windows Server 2008.)
O'Neill at Oversee Domain Services uses F5 Networks Inc.'s Acopia ARX Series switch to accomplish his data migrations. "I've used the Acopia ARX switch extensively for data migration, typically volume-to-volume or array-to-array migrations. Usually, I'm managing data at the volume level," says O'Neill. "For instance, if I have four to five filers that have a lot of information on them that I want to move to a different filer or I want to move data off a filer while I do maintenance, I would attach the other filer to the Acopia switch and use the rules engine [within the FreedomFabric Network OS] to move the data over."
Host-based software has been the most commonly used tool to migrate data. It's best for application-specific migrations such as platform upgrades from Microsoft Exchange 2003 to Exchange 2007, and for database replication and simple file copying. Host-based software such as Symantec Corp.'s Veritas Volume Manager or Brocade Communications Systems Inc.'s StorageX frees the storage array of processing and eases data migrations between heterogeneous storage. It's also more economical than other tools for small-scale migrations, but can become problematic when many systems are migrated. Other examples of host-based migration software are IBM Corp.'s Softek Transparent Data Migration Facility, which runs on z/OS, Unix, Linux and Windows servers; Quest Software Inc.'s Storage Consolidator for Windows; and Symantec's Veritas Volume Replicator. All of these software packages can be used to migrate volumes, files or blocks of data.
Array-based software is primarily used to migrate data between homogeneous storage devices and to reduce the impact on host computer operations. Users will likely choose array-based software to move data between generations of a vendor's product. Examples include EMC Corp.'s Symmetrix Remote Data Facility, IBM's Peer-to-Peer Remote Copy and Compellent's Storage Center Thin Import capability.
The scope of array-based software has recently changed. Hitachi Data Systems now offers a controller-based virtualization product with its TagmaStore array that supports the migration of data between Hitachi and non-Hitachi vendor arrays. EMC offers Open Replicator for Symmetrix.
The third type of migration tool used is a network appliance like F5 Networks' Acopia ARX Series switch, Brocade's File Management Engine or Sanrad Inc.'s V-Switch. These devices migrate volumes, files or blocks of data depending on their configuration. For example, the Acopia ARX Series switch migrates file-oriented data between NAS devices and file servers, while the Sanrad V-Switch migrates block-oriented data.
With a network-based appliance, performance can be improved by aggregating and balancing the migration load across the filers, says Oversee Domain Services' O'Neill. In addition, "if I want to take a filer out of operation for a firmware upgrade, I can migrate the data off the filer, pull the filer out from the back of the Acopia, do the firmware upgrade, put it back in and migrate the data back. There's no disruption to the application," he says.
Barry Thomas is network administrator at the Graves-Gilbert Clinic in Bowling Green, KY, which migrated to a Compellent SAN in January of this year. Thomas needed to use the most complicated approach, migrating between unlike storage devices.
Thomas chose to migrate the data from an EMC array, a Nexsan Technologies Inc. array and local servers to a Compellent SAN he had purchased. "We didn't do a lot of migrating before; we just way-overallocated storage to take care of that. It's expensive," he says. "The few times we had to do that we took the whole system down, moved from one storage solution to another and brought it back up."
The Graves-Gilbert Clinic had three EMC Clariion CX300 arrays, a Nexsan SATAboy and local storage on its servers, says Thomas. "We chose to give a temporary server the original volume and then give it the new Compellent volume. [We] then copied the data over from the old volume to the new volume using the Thin Import capability. If I gave the server a 100 gig volume that only had 80 gigs of data, then I only really consumed 80 gigs of volume space in the new array," he explains.
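Thomas's 100 gig/80 gig arithmetic can be illustrated with a short Python sketch. It is not Compellent's implementation; it simply assumes that a thin copy skips blocks containing only zeros, and the names thin_copy and BLOCK_SIZE are invented for the example.

```python
from typing import Dict

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def thin_copy(source: bytes, block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Copy a volume image into a sparse map of block index -> data.

    Blocks that contain only zero bytes are treated as never written and
    are skipped, so they consume no space on the (hypothetical) target.
    """
    allocated: Dict[int, bytes] = {}
    for offset in range(0, len(source), block_size):
        block = source[offset:offset + block_size]
        if any(block):  # True if at least one byte in the block is non-zero
            allocated[offset // block_size] = block
    return allocated


def consumed_bytes(allocated: Dict[int, bytes]) -> int:
    """Total space the sparse copy actually occupies."""
    return sum(len(block) for block in allocated.values())
```

Under that assumption, a source volume of 100 blocks with data in only 80 of them yields 80 allocated blocks on the target, which mirrors the ratio Thomas describes.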
"We used scripting on one occasion; during the day, I presented a new volume to the server and used the script to basically shut down all the services, copy data from one volume to the next and [then] send me a message when it was done," says Thomas. "I then came in and changed the volume label and it was good to go."
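The article doesn't reproduce Thomas's script, but the sequence he describes (stop the services, copy the volume, send a message) might look something like the following Python sketch. The stop_services and notify hooks are hypothetical placeholders for whatever service control and messaging a site uses, and the byte-for-byte check after the copy is an added precaution rather than something Thomas mentions.

```python
import filecmp
import shutil
from pathlib import Path
from typing import Callable


def migrate_volume(
    source: Path,
    target: Path,
    stop_services: Callable[[], None],
    notify: Callable[[str], None],
) -> int:
    """Quiesce, copy source to target, verify, then report.

    Returns the number of files copied. stop_services and notify are
    hypothetical hooks supplied by the caller.
    """
    stop_services()  # make sure nothing writes to the old volume mid-copy

    # Copy the whole directory tree; the target may already exist (for
    # example, as the mount point of the newly presented volume).
    shutil.copytree(source, target, dirs_exist_ok=True)

    # Verify every source file against its copy, byte for byte.
    copied = 0
    for src_file in source.rglob("*"):
        if src_file.is_file():
            dst_file = target / src_file.relative_to(source)
            if not filecmp.cmp(src_file, dst_file, shallow=False):
                raise RuntimeError(f"verification failed for {src_file}")
            copied += 1

    notify(f"Migration finished: {copied} files copied to {target}")
    return copied
```

Relabeling the volume, the final step Thomas performed by hand, is deliberately left out of the sketch, since it depends on the operating system and storage platform.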
Eric Nelson, director of information technology and CIO at St. Joseph Healthcare in Bangor, ME, used an appliance-based system to accomplish his data migration between heterogeneous devices. The hospital has 140 servers, eight virtual hosts and 94 virtual machines. It manages 14TB of data with a Sanrad V-Switch cluster, 28TB of data on a Hewlett-Packard (HP) Co. StorageWorks Enterprise Virtual Array (EVA) and 15TB on an EMC Symmetrix. To avoid vendor lock-in and redundant SANs at both of the hospital's sites, Nelson used a Sanrad V-Switch to migrate data from the EMC array to the HP StorageWorks EVA.
"Being that they were dissimilar SANs, I couldn't replicate between the two of them," says Nelson. "The only options vendors were giving me was to buy another one of their SANs and then they would set me up for replication. That's pretty expensive."
According to Nelson, "Sanrad was able to do asynchronous replication between different systems. We moved 13TB from the EMC Symmetrix to the HP EVA 8100. The migration was different depending on the application we used. We migrated the virtual machines with VMware tools. We had some issues with our file servers; for those migrations, we used backup and restore operations. At the same time, we were creating four different Microsoft clusters. We took those systems offline, backed up the data and then restored it to the new systems."
Rudolph Technologies' Fitch regularly performs data migrations between multiple tiers of storage (see "Old data isn't the only candidate for tiering," below). "We have two different tiers of storage: Fibre Channel and Serial ATA drives," says Fitch. "The Compellent Data Progression Module that's part of Storage Center lets us automatically move data between Tier 1 and Tier 3 storage nondisruptively."
Fitch just sets a threshold for the number of days of nonuse for data. "Image files that aren't refreshed frequently are moved down to Tier 3 storage automatically. Only frequently accessed files stay up on our large, expensive Fibre Channel drives," he notes.
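A days-of-nonuse threshold like the one Fitch sets can be sketched in a few lines of Python. This is a file-level illustration only: it assumes last-access timestamps are a fair proxy for "nonuse," and it says nothing about how Compellent's Data Progression Module tracks or moves data internally. The function name and parameters are invented for the example.

```python
import time
from pathlib import Path
from typing import Iterable, List, Optional

SECONDS_PER_DAY = 86400


def select_for_demotion(
    paths: Iterable[Path],
    max_idle_days: float,
    now: Optional[float] = None,
) -> List[Path]:
    """Return the files not accessed within the last max_idle_days days.

    These are the candidates a policy would move to the lower tier;
    everything else stays on the faster, more expensive drives.
    """
    current = time.time() if now is None else now
    cutoff = current - max_idle_days * SECONDS_PER_DAY
    return [p for p in paths if p.stat().st_atime < cutoff]
```

A scheduled job could call select_for_demotion with, say, a 30-day threshold and hand the resulting list to whatever tool performs the actual move. Note that some file systems are mounted with access-time updates disabled, in which case this proxy would not work.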
Data migrations are a fact of life, and automating the tedious process can be well worth the effort. Users have found that host-based software is perhaps best for application-specific migrations such as database replication. It frees the storage array of processing and can often be used more easily for migrating data between heterogeneous storage.
Array-based migration, such as that used by Compellent Technologies' customers, has also proven to be popular, and legions of customers use EMC's Symmetrix Remote Data Facility to move data between EMC arrays. Many users also rely on their chosen array's data utilities to migrate data from one array to another or to move data between tiers. Array-based migration from one array to another is a popular option not only for technology refreshes, but also for heterogeneous storage migration. Finally, the use of a network appliance provides additional flexibility for moving file-, block- or volume-based data among heterogeneous devices.