Definition

data archiving

By

What is data archiving?

Data archiving moves data that is no longer actively used to a separate storage device for long-term retention. Archive data consists of older data that remains important to the organization or must be retained for future reference or regulatory compliance reasons. Data archives are indexed and have search capabilities, so files can be located and retrieved.

Archived data is stored on a lower-cost tier of storage, reducing primary storage consumption and the related costs. An important aspect of a business's data archiving strategy is to inventory its data and identify what data is a candidate for archiving.

Some archive systems treat archive data as read-only to protect it from modification, while other data archiving products enable writes as well as reads. For example, WORM -- write once, read many -- technology uses media that is not rewritable.

Data archiving is most suitable for data that must be retained due to operational or regulatory requirements, such as document files, email messages and possibly old database records.

Data archiving benefits

The greatest benefit of archiving data is it reduces the cost of primary storage. Primary storage is typically expensive, because a storage array must produce a sufficient level of input/output operations per second to meet operational requirements for user read/write activity. In contrast, archive storage costs less, because it is typically based on a low-performance, high-capacity storage medium. Data archives can be stored on low-cost hard disk drives (HDDs), tape or optical storage that is generally slower than performance disk or flash drives.

Archive storage also reduces the volume of data that must be backed up. Removing infrequently accessed data from the backup data set improves backup and restore performance. Typically, organizations perform data deduplication on data being moved to a lower storage tier, which reduces the overall storage footprint and lowers secondary storage costs.

Data archiving vs. backup

Data archives are not to be confused with data backups, which are copies of data. Although both are considered secondary storage and use a lower-performance, higher-capacity storage medium than primary storage, they serve different purposes. Archives serve a data retention purpose, whereas backups are used for data protection and disaster recovery.

Data archives can be thought of as a data repository for data that is infrequently accessed but still readily available. Backups, on the other hand, are part of a data recovery mechanism that can be used to restore data in the event that it is corrupted or destroyed. Backup data often consists of important information that must be restored quickly when lost or deleted.

Online vs. offline data storage

Data archives take several different forms. Some systems make use of online data storage, which places archive data onto disk systems where it is readily accessible. Archives can be file-based or object storage-based.

Other archival systems use offline data storage in which archive data is written to tape or other removable media using data archiving software, rather than being kept online. Because tape can be removed, tape-based archives consume far less power than disk systems. This translates to lower archive storage costs.

Cloud storage is another possible archive target. Amazon Glacier, for example, is designed for data archiving. This method is inexpensive but requires an ongoing investment. In addition, costs can grow over time as more data is added to the storage cloud. Cloud providers usually store archived data on tape or HDDs.

Data archiving and data lifecycle management

The archival process is almost always automated using archiving software. The capabilities of such software vary from one vendor to the next, but most archiving software automatically moves aging data to the archives according to a data archival policy set by the storage administrator. This policy might also include specific retention requirements for each type of data.

Some archiving software will automatically purge data from the archives once it has exceeded the life span mandated by the organization's data retention policy. Many backup software and data management platforms have added archiving functionality to their products. This can be a cost-effective and efficient way to archive data. However, these products might not include all the functionality found in a dedicated archive software product.

Archiving for regulatory compliance

Some businesses are required to retain data for certain lengths of time due to regulatory compliance. Whether mandated by industry regulations or government legislation, staying within compliance guidelines is a prevalent business concern. Penalties for violating compliance can include payments for damages, fines and voided contracts.

Data archiving helps businesses meet compliance both by storing data long term and by consolidating data for easy access in case of an audit. The rules dictating the length of time for which data must be retained, where it can be stored and who has access to it vary by industry and the type of data that businesses in that industry generate.

Some examples of regulations with which organizations might need to be in compliance include the Sarbanes-Oxley Act, Health Insurance Portability and Accountability Act and General Data Protection Regulation.

This was last updated in October 2023

Continue Reading About data archiving

Archive vs. backup and why you need to know the differences

7 data archiving best practices for backup admins

How to choose a long-term data archiving services vendor

Dig Deeper on Archiving and tape backup

Disaster Recovery

When is a change to multi-cloud the right resilience move?
A multi-cloud strategy has numerous benefits, including enhanced resilience. However, it can also bring in several technical and ...
New SIOS console enables high availability visualization
IT generalists on Linux systems can avoid the complexity of HA management for mission-critical apps or databases with a new ...
4 disaster recovery plan best practices for any business
Disaster recovery plans are unique, built around an organization's size, type and industry. However, there are some key best ...

A guide to AWS storage cost
Amazon S3, FSx, Elastic File System and Elastic Block Store have different price structures, but data use remains the top factor ...
Panasas embraces software-defined storage, rebrands as Vdura
Panasas is now Vdura, a software-defined storage company that aims to make its cluster parallel file system work on a swath of ...
Dell Technologies World 2024 news and conference coverage
Explore this updating guide on Dell Technologies World 2024. The show will shine a major spotlight on AI, but also cover topics ...

Kaseya Connect Global brings Partner First Pledge, new service
Kaseya's pledge to partners promises more pricing controls and a new subscription service that lets MSPs manage and secure their ...
Gartner's IT services forecast calls for consulting uptick
IT service providers could benefit from a less-constrained tech purchasing climate as enterprises seek to bolster in-house skills...
Use of AI in business drives increase in consulting services
The slice of businesses using external providers for AI uptake is set to more than double, according to U.S. Census survey data. ...

Close