Hybrid cloud storage has become increasingly popular, but there are some potentially costly errors businesses should look out for during a hybrid cloud implementation.
Hybrid clouds can significantly reduce capital expenditures by shrinking the amount of on-premises storage hardware, software and infrastructure companies need, which in turn reduces on-premises operating expenditures. A hybrid approach can also cut the amount of data stored in the public cloud, because data is deduplicated and compressed on premises before it migrates to the cloud. Storing less data in the public cloud reduces the monthly bill.
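As a rough illustration of that savings, the sketch below estimates a monthly cloud bill after on-premises data reduction. The reduction ratios and the per-gigabyte price are hypothetical examples, not real provider pricing:

```python
# Rough sketch of how on-premises deduplication and compression can shrink
# a public cloud storage bill. All figures are hypothetical examples.

def monthly_cloud_cost(raw_gb, dedupe_ratio, compression_ratio, price_per_gb):
    """Return (stored_gb, monthly_cost) after data reduction.

    dedupe_ratio and compression_ratio are N:1 factors,
    e.g. dedupe_ratio=4 means 4:1 deduplication.
    """
    stored_gb = raw_gb / (dedupe_ratio * compression_ratio)
    return stored_gb, stored_gb * price_per_gb

# 100 TB of raw data, 4:1 dedupe, 2:1 compression, $0.023/GB-month (hypothetical)
stored, cost = monthly_cloud_cost(100_000, 4, 2, 0.023)
print(f"Stored: {stored:,.0f} GB, monthly cost: ${cost:,.2f}")
```

With these example ratios, only one-eighth of the raw data ever reaches the cloud, and the bill shrinks proportionally.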
Unfortunately, there are two costly and aggravating mistakes IT shops can make during a hybrid cloud implementation: choosing the wrong kind of public cloud storage and using the wrong on-premises storage.
Picking the wrong public cloud
The first common mistake made during a hybrid cloud implementation is choosing the wrong type of public cloud storage. There are six types of public cloud storage:
- Block storage, which is local embedded disk or SAN storage for applications in the cloud that require higher performance.
- File or NAS storage, which is for applications that need NFS or SMB protocols.
- Object storage used for active archiving.
- Object storage used for cool archiving.
- Object storage used for cold archiving.
- Tape storage -- typically a linear tape file system -- which is also for cold archiving.
Each type of cloud storage has distinctive performance characteristics and costs, and choosing the wrong type can have disastrous consequences for a hybrid cloud implementation. For example, block storage has the lowest latency and the highest IOPS and throughput, but it also has the highest storage cost. It can cost as much as 30 times more than active or cool archive storage. Choosing block cloud storage when object cloud storage will do the job is a very costly mistake.
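To put that 30x figure in perspective, here is a back-of-the-envelope comparison. The per-gigabyte prices are hypothetical placeholders chosen only to reflect the article's ratio:

```python
# Hypothetical per-GB monthly prices illustrating that block cloud storage
# can cost ~30x more than active/cool archive storage. Not real pricing.
PRICES = {            # $/GB-month -- illustrative only
    "block": 0.30,
    "active_archive": 0.01,
}

data_gb = 50_000      # 50 TB that only needs archive-class access
for tier, price in PRICES.items():
    print(f"{tier}: ${data_gb * price:,.2f}/month")

overpay = data_gb * (PRICES["block"] - PRICES["active_archive"])
print(f"Choosing block when archive would do wastes ${overpay:,.2f}/month")
```

At these example rates, parking 50 TB of archive-class data on block storage wastes well over $14,000 every month.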
A similar cost issue can occur if a shop inappropriately selects cold archive cloud storage. Cold archive storage is affordable, usually less than 1 cent per gigabyte per month. But users who need access to data in that cold archive may run into problems. First, retrieval is slow: The first byte of data can take five hours to arrive. In addition, there are transit fees: The cloud storage service provider charges customers for reading more than a very small percentage of data from the archive. These fees can be as much as 12 times the storage costs.
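The sketch below shows how a single large restore can dwarf the monthly storage bill. The storage price and the fee multiplier are hypothetical, chosen only to match the figures above:

```python
# Sketch of why a large read from cold archive storage can dwarf the monthly
# storage bill. Prices and the fee multiplier are hypothetical examples.
storage_price = 0.004          # $/GB-month, "less than 1 cent per GB"
retrieval_multiplier = 12      # fees can be up to 12x the storage cost

archive_gb = 200_000           # 200 TB sitting in cold archive
monthly_storage = archive_gb * storage_price

restore_gb = 150_000           # a large, DR-style restore
retrieval_fee = restore_gb * storage_price * retrieval_multiplier

print(f"Monthly storage:  ${monthly_storage:,.2f}")
print(f"One-time restore: ${retrieval_fee:,.2f}")
```

In this example, one restore costs roughly nine months' worth of storage fees, which is exactly the kind of surprise that makes cold archive the wrong home for data users actually read.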
Avoiding this hybrid cloud implementation mistake requires accurately matching the characteristics of the data to where it will be stored. How frequently will users access the data? What are the performance requirements for reads? What are the data retention requirements? How much data will be kept on premises versus in public cloud storage? Answers to these questions also affect the second common mistake.
Picking the wrong on-premises storage
The second most common hybrid cloud implementation mistake is selecting the wrong on-premises storage. There are four primary ways to deploy hybrid cloud storage systems:
1. Use a primary NAS or SAN storage system that replicates snapshots or tiers data to the public cloud storage based on policy. When tiering, the system leaves a stub locally that makes it appear as though the public cloud storage data is still local.
2. Utilize a gateway or cloud-integrated storage (CIS). The CIS looks like local NAS or SAN storage. It caches data locally while it moves all or most data to the public cloud based on policies. It also leaves a stub that makes data in the public cloud appear to be local.
3. Install an on-premises object storage system that either provides the same de facto interface as public cloud storage or extends to it. When the on-premises object storage utilizes the same interface as the public cloud storage, applications can write to either -- or both -- based on their requirements. When the on-premises object storage system treats the public cloud storage as an extension or remote target of the object store, it replicates data to the public cloud based on policy, similar to NAS or SAN tiering storage to the cloud. If the public cloud uses the same object storage software, then it can become a geographic extension of the on-premises object storage.
4. Continue to use the current NAS or SAN storage system and utilize archiving or backup software that copies data to the public cloud based on policy. Archiving software can also delete local copies of the data based on policy.
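All four options hinge on policy-driven data movement. As a minimal sketch of such a policy (the file records and the 90-day threshold are hypothetical), an age-based rule might select tiering or archive candidates like this:

```python
# Minimal sketch of the policy-based movement the four options rely on:
# pick files to copy or tier to public cloud storage by age since last
# access. The 90-day threshold and the file records are hypothetical.
from datetime import date, timedelta

TIER_AFTER_DAYS = 90

files = [
    {"path": "/data/q1-report.xlsx", "last_access": date(2019, 1, 5)},
    {"path": "/data/active.db",      "last_access": date.today()},
]

def to_tier(files, today=None):
    """Return paths of files untouched for longer than the threshold."""
    today = today or date.today()
    cutoff = today - timedelta(days=TIER_AFTER_DAYS)
    return [f["path"] for f in files if f["last_access"] < cutoff]

print(to_tier(files))  # old, untouched files are candidates for the cloud tier
```

Real products layer many more conditions onto this (file type, owner, size, retention rules), but the core decision each option makes is the same kind of filter.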
Every one of these options has pros and cons and works best in different use cases. Picking the wrong one can have severe consequences. CIS systems, for example, tend to be quite cost-effective; some public cloud storage service providers include them at zero or limited additional monthly cost, which can be a great deal. But a CIS can also become quite costly if the amount of data cached locally is smaller than what applications need. When that happens, the CIS constantly pulls data from the public cloud back to on-premises storage, incurring a large performance penalty from internet latency and an additional penalty for data rehydration. There is also a high likelihood the company will have to pay transit fees to the service provider for reading data back out of the public cloud.
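A back-of-the-envelope model makes the cache-sizing risk concrete. The miss-rate assumption, egress fee and data volumes below are all hypothetical:

```python
# Sketch of the CIS cache-sizing problem: when the local cache is smaller
# than the working set, misses are read back from the public cloud, with
# latency and per-GB transit fees. All numbers are hypothetical.
def monthly_miss_cost(working_set_gb, cache_gb, reads_per_month_gb, egress_fee):
    # Crude assumption: miss rate proportional to the uncached fraction.
    miss_rate = max(0.0, 1 - cache_gb / working_set_gb)
    pulled_gb = reads_per_month_gb * miss_rate
    return pulled_gb, pulled_gb * egress_fee

pulled, fee = monthly_miss_cost(
    working_set_gb=10_000, cache_gb=2_000,
    reads_per_month_gb=5_000, egress_fee=0.09)  # $0.09/GB hypothetical
print(f"{pulled:,.0f} GB pulled from cloud, ${fee:,.2f} in transit fees")
```

Even in this simplistic model, an undersized cache turns a "free" CIS into hundreds of dollars of transit fees per month, before counting the performance hit.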
Disaster recovery (DR) can be problematic for the CIS and tiering storage system options. Data in the public cloud cannot be read directly; it must pass back through the CIS or an on-premises cloud tiering storage system. That means a duplicate of the CIS or cloud tiering storage system must be made available in the cloud provider's facility or at the DR facility. Several CIS and tiering storage system providers now offer software variations that can run as virtual machines in the cloud or at the DR provider's facilities. Regardless, the additional hardware and software add to the cost.
Object storage can be one of the simpler integrations between on-premises storage and a public cloud; however, object storage is not known for high performance. To avoid excessive user complaints, it is imperative to make sure the object storage system's performance matches application requirements. Additionally, most object storage systems use the de facto standard Amazon Web Services Simple Storage Service (S3) interface, but not all S3 implementations are the same: Many support only a subset of S3. An application designed for the S3 interface must be certified to work with the subset the on-premises object store uses, as well as the one in the public cloud. Otherwise, administrators should expect irritation, aggravation and stress, because troubleshooting this problem takes significant time and effort.
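A first-pass compatibility check can be as simple as set arithmetic over the operations each side supports. The operation names below are hypothetical stand-ins; real certification means testing the application against each implementation:

```python
# Illustrative check that an application's required S3 operations are covered
# by an on-premises object store's S3 subset. Operation lists are hypothetical
# stand-ins, not an authoritative S3 API inventory.
APP_REQUIRES = {"GetObject", "PutObject", "ListObjectsV2", "MultipartUpload"}

ON_PREM_SUPPORTS = {"GetObject", "PutObject", "ListObjectsV2"}   # no multipart
PUBLIC_CLOUD_SUPPORTS = APP_REQUIRES | {"SelectObjectContent"}

def missing_ops(required, supported):
    """Operations the application needs that a given store lacks."""
    return sorted(required - supported)

print("On-prem gaps:", missing_ops(APP_REQUIRES, ON_PREM_SUPPORTS))
print("Cloud gaps:  ", missing_ops(APP_REQUIRES, PUBLIC_CLOUD_SUPPORTS))
```

Any nonempty gap list flags an operation that would fail against that store, which is far cheaper to discover in a checklist than in production.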
Backing up or archiving to the cloud can be a significant cost saver, but it can also cause intense heartburn. Sending backups to the cloud is fairly simple, but recovering them may not be. Typically, a backup requires a media server to recover and restore the data. Most hybrid cloud implementations include one or more media servers on premises. That simplifies recoveries and restores on premises and makes them much faster than attempting to recover and restore from the public cloud. But recovering and restoring data in the cloud still requires a physical or virtual media server in the public cloud. If there is no media server in the cloud, there are no recoveries or restores in the cloud. In addition, if the recoveries and restores are coming from one of the variations of object storage archive in public cloud storage, do not expect them to be fast.
Archiving to create a hybrid cloud is often complicated. The on-premises source storage and the public cloud storage are unaware of each other, so applications and users need to know where their data currently resides to be able to access it. Some archiving software leaves behind a stub; however, links can break, and users might become annoyed or angry at not being able to find their data. Most archiving software can help locate data with admins' help, but troubleshooting is often a time-consuming exercise.
Just like with the public cloud, it is crucial to match the characteristics of the data stored on premises to the ability of the on-premises storage systems to meet them. Shops can avoid mistakes by spending time and effort doing the groundwork upfront.