

Amazon Glacier’s cold data storage lead challenged

Google Cloud Storage Nearline takes on Amazon Glacier cold data storage, promising restore times in seconds rather than the hours it takes for Glacier customers.

Like Amazon Glacier, Google's new cold storage cloud service stores data at a penny a gigabyte per month. The difference is that Google is promising restores in seconds while Amazon takes hours to restore data from Glacier.

Google Cloud Storage Nearline went into beta this week. Google said its low-cost archive cloud will let customers retrieve data in three seconds. That addresses the biggest drawback of Amazon Glacier, which has a restore time of three to five hours.

Google Cloud Storage Nearline and Glacier are designed for companies that generate multiple terabytes of data and want to store it cheaply. This type of "cold" data is rarely retrieved, and organizations typically store it on tape. Google's short retrieval time also makes its Nearline service a viable option for disaster recovery.

According to a Google blog post, "Nearline Storage is appropriate for storing data in scenarios where slightly lower availability and slightly higher latency, typically just a few seconds, is an acceptable tradeoff for lower storage costs."

Amazon Web Services (AWS) launched Glacier cold data storage in August 2012 as a low-cost service for archived data that's rarely accessed but needs to be retained for long periods. This is different from Amazon Simple Storage Service (S3), which is for data that needs to be accessed in real time.

Google Cloud Storage Nearline and Amazon Glacier cost the same. Google, which is moving into nearline storage 30 months after the arrival of Glacier, is trying to make up for lost time with superior response times. It remains to be seen whether Amazon reacts by lowering Glacier's retrieval time.

"That is a pretty competitive difference," said Mike Matchett, senior analyst at Taneja Group. "It's a direct shot at Glacier, which means Amazon will have to do something soon."

The difference in Google's and Amazon's data retrieval times raises the question of what type of storage they use for these cold data storage services.

The Google blog stated customers "should expect 4 megabytes per second of throughput per terabyte of data stored as Nearline Storage. This throughput scales linearly with increased storage consumption. For example, storing 3 TB guarantees 12 MB of throughput, while storing 100 TB of data would provide users with 400 MB of throughput."
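The scaling rule in that quote can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not anything Google publishes: the function names are hypothetical, the "MB of throughput" figures are read as megabytes per second, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
# Sketch of the quoted Nearline throughput rule: 4 MB/s per TB stored.
# Assumptions (not from Google): decimal units and hypothetical names.

MB_PER_TB = 1_000_000  # assumed decimal conversion


def nearline_throughput_mb_s(stored_tb: float, rate_mb_s_per_tb: float = 4.0) -> float:
    """Return the expected throughput in MB/s for a given amount stored."""
    return stored_tb * rate_mb_s_per_tb


def full_restore_hours(stored_tb: float, rate_mb_s_per_tb: float = 4.0) -> float:
    """Estimate the hours needed to read back everything that is stored."""
    if stored_tb <= 0:
        raise ValueError("stored_tb must be positive")
    total_mb = stored_tb * MB_PER_TB
    seconds = total_mb / nearline_throughput_mb_s(stored_tb, rate_mb_s_per_tb)
    return seconds / 3600


if __name__ == "__main__":
    for tb in (3, 100):
        print(tb, "TB ->", nearline_throughput_mb_s(tb), "MB/s,",
              round(full_restore_hours(tb), 1), "hours for a full restore")
```

Because throughput grows in step with the amount stored, the estimated time to read back an entire data set comes out the same at any size under these assumptions. The first byte may arrive in seconds, but a full restore is bounded by the throughput figure.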

"The question is, how are they doing that?" Matchett said. "No one really knows what is inside Amazon and Google for sure. They probably built their own architectures. There is some speculation. Glacier looks and acts like tape, cheap to store but expensive to recover. Google acts like a distributed object store."

Matchett said Google also preserved the same storage API for the Nearline cold data storage service as for its other classes of storage. AWS requires different APIs for Glacier, which means applications have to adjust depending on which storage they are using.

Join the conversation

Will your company abandon Amazon Glacier cold data storage for Google Cloud Storage Nearline?
This is what we all want: low-cost storage and fast restores. I have been a cold storage user for the past three years (Glacier via Zoolz), and I am very happy with the solution built on Glacier, which has many helpful features and options. Nearline could be even better, given its restore time. I hope I will be able to use Nearline via Zoolz.
Hours? That's totally crazy.
Well, neither AWS nor Google provides any detailed information on its "cold" storage service. The suspicion has been that AWS Glacier uses tape, which may account for the delay in getting data ready for download from Glacier. Judging by the response time, Google Nearline appears to rely on HDDs in an object storage cluster. The challenge for Google is that it has no ecosystem of third-party solutions that work with Nearline, while AWS has a large ecosystem of solutions that work with both S3 and Glacier. Both AWS and Google know how to provide data storage at web scale. The competition will be based not only on price but also on how well their storage services can be leveraged by third parties.
It sounds like a good idea on paper and a way to get market share, but I am skeptical that it can live up to its hype.
The key with both of these is the SLA offered, and the penalties associated with those SLAs. They have to be something worth the risk.