Cloud data migration is one of the biggest hurdles facing enterprises that want to take a stake in the cloud. Particularly when working with large files or big data sets, moving from on-premises to the cloud takes an enormous amount of bandwidth -- so much so that one of the more popular methods today is physically mailing media to the cloud provider.
"We think of this cloud as this great place and you can just put data up there using your mobile phones and everything else, but because of bandwidth, it's limited," said Mike Matchett, senior analyst at Hopkinton, Mass.-based firm Taneja Group.
According to Matchett, there are more options for smooth cloud data migration available today than there were in the past. For example, software such as Attunity Replicate uses parallel bandwidth to speed up cloud data uploads and downloads. In addition, many array vendors are beginning to include data migration technology in their products that automatically moves cold data to the cloud in a hybrid setup.
Mike Matchett, senior analyst, Taneja Group
But it's not just the process of cloud data migration that's giving pause to IT professionals -- they're also grappling with the decision of which data to migrate. According to Matchett, security concerns and compliance issues mean not all data is fit for the cloud. The quantity and size of data also come into play, although they are often of little concern when data is housed on-premises. Performance is another important factor -- if access to data truly needs to be low-latency, that data may be better off residing on-premises.
"I think if you do a careful analysis of data and where it best lives and then look at the different kinds of hybrid architectures you build, you'll be better off," said Matchett.
Transcript - Considerations for cloud data migration and security
Do enterprises have to take the same security precautions with hybrid cloud as they would in a public cloud?
Mike Matchett: Generally, when enterprises go into a hybrid cloud, they have to really work harder on security, because you're crossing security domains. The way the cloud security works -- and the cloud can be more secure than a lot of data centers the way they construct and operate these places even though they're multi-tenant -- you tend to have a cloud user that gets access to the infrastructure.
And within an enterprise, we tend to have LDAP or Active Directory and roles and permissions, and everybody gets their own different user ID and identity access. So one of the problems with the hybrid cloud is really bringing those two kinds of security domains together and making it work. And a lot of work has to go into that federated ID scheme. So that's one of the challenges, and one of the things to look for when you're looking to build a hybrid cloud is a good solution for federated identity.
What are some challenges of moving data from on-premises storage to the cloud?
Matchett: Obviously, in moving data, the bigger the data gets, the harder it is to move. And in fact, one of the interesting points that I learned a while ago was that the more popular way to move lots of data into Amazon is using FedEx and FedExing tapes or disk drives. We think of this cloud as this great place and you can just put data up there using your mobile phones and everything else, but because of bandwidth, it's limited.
If you really want to move data into a cloud scenario, sometimes you still have to use the old methods of shipping media. But there are helpful solutions. Attunity has a product that helps use parallel bandwidth and move data up and down. There are some incremental ways to copy live data sets: synchronous or asynchronous replication can be done across long distances to keep data sets in sync. So there are definitely some ways to approach that, and you do have to think about that data friction.
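To make the bandwidth limit concrete, the following is a rough back-of-the-envelope sketch in Python, not a method described by Matchett. It estimates how long a network transfer would take and compares that with a shipping time. The function names, the 80% link efficiency and the example figures are all illustrative assumptions.

```python
def transfer_days(data_terabytes: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to send data over a network link.

    data_terabytes: data set size in decimal terabytes (assumed input)
    link_mbps:      nominal link speed in megabits per second (assumed input)
    efficiency:     fraction of the link actually usable (assumed to be 0.8)
    """
    total_bits = data_terabytes * 1e12 * 8          # terabytes -> bits
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    seconds = total_bits / usable_bits_per_second
    return seconds / 86_400                          # seconds -> days


def faster_to_ship(data_terabytes: float, link_mbps: float,
                   shipping_days: float, efficiency: float = 0.8) -> bool:
    """Return True when shipping media would beat the network transfer."""
    return transfer_days(data_terabytes, link_mbps, efficiency) > shipping_days


if __name__ == "__main__":
    # Hypothetical example: 100 TB over a 1 Gbps link versus 3-day shipping.
    days = transfer_days(100, 1000)
    print(f"Network transfer: about {days:.1f} days")
    print("Ship media instead?", faster_to_ship(100, 1000, shipping_days=3))
```

Under these assumed numbers, the network transfer takes more than 11 days, which is why shipping drives can still win for very large data sets. The estimate ignores the time needed to copy data onto and off the shipped media.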
What would you say are some tips for defining what data in your cloud environment gets moved into the cloud and what stays on-premises?
Matchett: I think if you're going to look at just cloud storage...as we started off by talking about, the traditional array vendors are going to start building cloud storage as a tier. So things will auto-tier and eventually migrate down as they get colder and colder, and just go out to the cloud and you won't have to think about it. Today, we have to be a little bit more deliberate about our cloud usage and look at the regulation and compliance factors, for example, on geo-distribution. Maybe I can't put this data in that cloud, or I can't put it there but I have to put it in this region. There's definitely some sensitivity.
There's HIPAA regulations. There's quantity and size. And [there's] access modes and access performance. One of the things we get out of the cloud is the ability to access from different points and we can buy IOPS, but it's really hard to guarantee latency sometimes and performance. So our higher speed data sets might have to be on site. So I think if you do a careful analysis of data and where it best lives and then look at the different kinds of hybrid architectures you build, you'll be better off.
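The factors Matchett lists (regulation and region, latency needs, and how cold the data is) could be captured in a simple placement rule. The Python sketch below is only an illustration of that kind of analysis, not a tool or policy from the interview. The `DataSet` fields, the `place` function, the 90-day cold threshold and the region names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    regulated: bool                 # e.g. subject to HIPAA or data-residency rules
    allowed_regions: tuple          # cloud regions permitted for regulated data
    needs_low_latency: bool         # access latency must be guaranteed
    days_since_last_access: int     # rough measure of how "cold" the data is


def place(ds: DataSet, cloud_region: str, cold_after_days: int = 90) -> str:
    """Suggest 'cloud' or 'on-premises' for a data set (illustrative rule only)."""
    # Compliance first: regulated data may only go to approved regions.
    if ds.regulated and cloud_region not in ds.allowed_regions:
        return "on-premises"
    # Latency is hard to guarantee in the cloud, so keep fast data sets on site.
    if ds.needs_low_latency:
        return "on-premises"
    # Cold data is the natural candidate for a cloud tier.
    if ds.days_since_last_access >= cold_after_days:
        return "cloud"
    # Warm data stays local until it cools (an assumption of this sketch).
    return "on-premises"


if __name__ == "__main__":
    archive = DataSet("2012-logs", False, (), False, 400)
    records = DataSet("patient-records", True, ("eu-west",), False, 400)
    trading = DataSet("order-book", False, (), True, 0)
    for ds in (archive, records, trading):
        print(ds.name, "->", place(ds, cloud_region="us-east"))
```

A real analysis would weigh more factors, such as data size and cost, but ordering the checks this way reflects the idea that compliance and latency constraints rule data out of the cloud before coldness rules it in.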