cloud archive

Contributor(s): Dave Raffo

A cloud archive is a storage-as-a-service offering for long-term data retention. The archive holds infrequently accessed data and may be optimized for security and for compliance with data regulation policies.

Archiving was considered an early killer app for cloud storage, and it became one of the first popular use cases for several reasons:

  • Storing archived data in the cloud can be cost-effective when compared with storing and maintaining large amounts of nonessential data in-house.
  • Using the cloud alleviates the need for buying and upgrading on-premises disk or tape hardware systems and archiving software to manage and store nonprimary data.
  • Archived data rarely has to be brought down from the cloud, a process that can be time-consuming and expensive.

Cloud archiving is often done completely in a public cloud, although there are hybrid setups where data that may require faster access is stored on premises with only rarely accessed cold data moved off-site.

Public clouds require no special on-premises hardware or software. An organization can reduce its data center footprint and use less power and cooling by storing data in the cloud. Public cloud archive attributes include elasticity, abstraction (it doesn't matter to the customer whether data is stored on tape or disk), durability and low cost.

Popular public cloud archiving services for cold data, such as Amazon Glacier and Google Cloud Storage Nearline, store data for as little as a penny per gigabyte per month. However, some low-cost services take hours to restore data. There may also be extra costs to transfer data back out of the cloud.
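To see how storage and transfer charges combine, here is a minimal sketch in Python. The one-cent storage rate comes from the figure above and is assumed here to be billed per gigabyte per month. The transfer-out rate and the function name are hypothetical placeholders for illustration, not published prices from any provider.

```python
from decimal import Decimal

# Assumption: the "penny per gigabyte" rate is billed per month.
STORAGE_RATE_PER_GB_MONTH = Decimal("0.01")
# Hypothetical transfer-out (egress) rate, for illustration only.
EGRESS_RATE_PER_GB = Decimal("0.09")


def archive_cost(stored_gb, months, restored_gb=0):
    """Estimate total cost: storage over time plus data transferred back out."""
    storage = Decimal(stored_gb) * STORAGE_RATE_PER_GB_MONTH * months
    egress = Decimal(restored_gb) * EGRESS_RATE_PER_GB
    return storage + egress
```

Under these assumed rates, archive_cost(10_000, 12, restored_gb=500) estimates one year of storing roughly 10 TB plus a single 500 GB restore. Actual bills depend on the provider's current pricing, minimum storage durations and retrieval fees, none of which are modeled here.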

Cloud archiving vendors

Cloud gateways are frequently used to help move data into the cloud in the right format. These gateways are sold by vendors including Amazon, Ctera, EMC, Microsoft, Nasuni, NetApp and Panzura. When recovering data, the cloud may need to support the application used to create the data. Retrieval times also vary by service: Glacier, for example, may require more than three hours to restore data, while Google Nearline has a response time of approximately three seconds.

There are specialized enterprise cloud archiving vendors that usually focus on vertical markets. For example, Proofpoint has an archiving service for the financial industry for email, documents, instant messages, social media and other forms of electronic communication. Mimecast archives email and files for industries such as healthcare, legal and manufacturing.

When looking for a cloud archiving provider, organizations should consider the provider's service-level agreement for data recovery, what tools are available to find data when it is needed, whether the cloud has a self-service portal, whether the cloud meets all of the customer's compliance requirements and whether the application that stores the data is supported.

Cloud archiving security

As with any other archiving product or application, a cloud archive service must provide secure storage optimized for long-term data retention that complies with data regulation policies. For security in flight, data moving in and out of the cloud is transferred over HTTPS. Most providers can also encrypt data stored in their clouds. Customers can supply their own encryption keys for an extra layer of security, or encrypt data before sending it to the cloud.

An archive in the cloud must be easily searchable and protected from tampering or overwriting, and it must allow easy access to specific data when it is required for a compliance audit or e-discovery.
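One way a customer could independently check that archived objects have not been altered is to record a cryptographic digest of each object at archive time and compare it later. The following Python sketch illustrates that idea with SHA-256 from the standard library. The function names and the in-memory dictionary of objects are hypothetical; this is not how any particular provider implements tamper protection.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of an archived object's bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(objects: dict) -> dict:
    """Record one digest per object name at archive time."""
    return {name: sha256_digest(data) for name, data in objects.items()}


def find_tampered(objects: dict, manifest: dict) -> list:
    """Return names that are missing or whose bytes no longer match the manifest."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in objects or sha256_digest(objects[name]) != digest
    )
```

In practice the manifest would be kept separately from the archive itself, so that someone who can modify the archived data cannot also rewrite the recorded digests.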

Cloud vs. tape

The cloud is an alternative to on-premises tape, which is frequently used to archive data for long-term retention. To replace tape, a cloud archive must match tape's low cost, longevity, scalability and security. Tape has the advantage of portability; it can be shipped across locations without the need to rewrite data. Cloud's advantages are geographical redundancy that mitigates the risk of data loss from hardware failures, advanced search capabilities and the elimination of costs from technology refreshes.

Cloud archive vs. cloud backup

A cloud archive should not be confused with a cloud backup.

Just as there are differences between on-premises archive and backup, cloud archive and cloud backup are not the same thing. Backup involves copying data at regularly scheduled intervals, and often involves data that has changed since the previous backup. Cloud archiving moves data off-site once, and that data is not changed after it goes to the cloud. Archiving is usually done to free up storage space for more frequently accessed data.
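Because archiving targets data that is no longer changing, a common first step is to identify files that have not been modified for a long time. The Python sketch below shows one simple way to do that by last-modification time. The function name and the age threshold are hypothetical, and real archiving tools typically apply richer policies (file type, owner, retention rules) than this.

```python
import os
import time


def find_archive_candidates(root, min_age_days, now=None):
    """Return paths under root last modified more than min_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Files whose modification time predates the cutoff are "cold".
            if os.stat(path).st_mtime < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

A backup job, by contrast, would run on a schedule and look for the opposite condition: files modified since the previous run.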

This was last updated in May 2016
