The term “archiving” is used in different contexts, and its varied use across vertical markets leads to confusion and communication problems. Working on strategy projects with IT clients has taught me to always clarify what “archive” means in their environments. To help with that, here are a few basics about what we mean when we say “archive.”
Archive is a verb and a noun. We’ll deal with the noun first and discuss what an archive means depending on the perspective of the particular industry.
In the traditional IT space, such as commercial business processing, an archive is where information is moved when it is not normally required in day-to-day processing activities. The archive is a storage location for that information, and it is typically seen as either an online archive or a deep archive.
An online archive holds data that has been moved from primary storage but can still be seamlessly and directly accessed by applications or users without involving IT or running additional software processes. This means the information is seen in the context the user or application would expect. The online archive is usually protected by replication to another archive system, separate from the backup process. The size of an online archive can be capped by moving information, based on criteria, to a deep archive.
A deep archive stores information that is not expected to be needed again but cannot be deleted. While storing information there is expected to be much less expensive, accessing it may take more time than the user would normally tolerate. How data gets moved to the deep archive is one of the key areas of differentiation among solutions. Some online archives can have criteria set to automatically and transparently move data to the deep archive, while others require separate software to make the decisions and perform the actions.
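The criteria-based movement between primary storage, an online archive, and a deep archive can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Tier`, `Item`, and `apply_policy` names are hypothetical, and the idle-time thresholds are invented for the example rather than taken from any product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum


class Tier(IntEnum):
    """Storage tiers ordered from most to least accessible."""
    PRIMARY = 0
    ONLINE_ARCHIVE = 1
    DEEP_ARCHIVE = 2


@dataclass
class Item:
    name: str
    last_accessed: date
    tier: Tier = Tier.PRIMARY


# Hypothetical criteria; real thresholds are set per site and per solution.
ONLINE_AFTER = timedelta(days=90)
DEEP_AFTER = timedelta(days=3 * 365)


def target_tier(item: Item, today: date) -> Tier:
    """Pick the tier an item belongs in, based only on how long it has been idle."""
    idle = today - item.last_accessed
    if idle >= DEEP_AFTER:
        return Tier.DEEP_ARCHIVE
    if idle >= ONLINE_AFTER:
        return Tier.ONLINE_ARCHIVE
    return Tier.PRIMARY


def apply_policy(items, today):
    """Move items to a colder tier when the criteria say so; return the moves made."""
    moves = []
    for item in items:
        new_tier = target_tier(item, today)
        if new_tier > item.tier:  # this sketch only ever moves data to a colder tier
            moves.append((item.name, item.tier, new_tier))
            item.tier = new_tier
    return moves
```

Calling `apply_policy` on a schedule would correspond to the automatic, transparent case; running the same decision logic from a separate tool would correspond to the case where additional software makes the decisions.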
In healthcare, information such as radiological images is initially stored in an archive (which translates to primary storage for those in the traditional IT space). Usually, as images are stored in the archive, a copy is made in a deep archive as the initial protected copy. The deep archive is in turn replicated for protection. Based on policies, the copy in the archive may be discarded after a period of time (in many cases, one year), with the copies in the deep archive remaining. A copy in the deep archive is accessed by promoting it to the archive, either for a scheduled patient visit or on demand for an unplanned visit or consultative search.
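As a rough model of that healthcare lifecycle, the following Python sketch tracks which images exist in the archive and in the deep archive. The `ImagingStore` class and its method names are hypothetical, the one-year retention is just the common case mentioned above, and replication of the deep archive is left out to keep the example short.

```python
from datetime import date, timedelta

# Assumption: the one-year archive retention mentioned as a common case.
ARCHIVE_RETENTION = timedelta(days=365)


class ImagingStore:
    """Toy model of an imaging archive backed by a deep archive."""

    def __init__(self):
        self.archive = {}       # image_id -> date stored or last promoted
        self.deep_archive = {}  # image_id -> date stored

    def store(self, image_id, today):
        """Store a new image in the archive and make the initial protected copy."""
        self.archive[image_id] = today
        self.deep_archive[image_id] = today

    def expire(self, today):
        """Discard archive copies past retention; deep archive copies remain."""
        expired = [i for i, d in self.archive.items()
                   if today - d >= ARCHIVE_RETENTION]
        for image_id in expired:
            del self.archive[image_id]
        return expired

    def promote(self, image_id, today):
        """Copy an image from the deep archive back into the archive."""
        if image_id not in self.deep_archive:
            raise KeyError(image_id)
        self.archive[image_id] = today

    def read(self, image_id, today):
        """Return where the image was served from, promoting on demand if needed."""
        if image_id in self.archive:
            return "archive"
        self.promote(image_id, today)
        return "promoted"
```

A scheduled visit would call `promote` ahead of time, while an unplanned visit would go through `read` and pay the promotion delay at access time.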
For media and entertainment, the archive is the repository of content representing an asset, such as movie clips. The archive in this case may have different requirements than a traditional IT archive because of the performance demands on access and because the value of the information requires integrity validation and long retention, which could be forever. Discussing the needs of an archive in this context is really about an online repository with specific demands on access and protection.
As a verb, archive is about moving information to the physical archive system. The move may be performed by the application itself, which stores the information in the archive. An example of this would be a Picture Archiving and Communication System (PACS) or Radiology Information System (RIS) in healthcare. In other businesses, third-party software may move the information to the archive. In the traditional IT space, this could be a solution such as Symantec Enterprise Vault that moves files or emails to an archive target based on administrator-set criteria.
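To make the verb concrete, here is a minimal Python sketch of archiving files by an administrator-set age criterion. It is not how Enterprise Vault or any other product works; the `archive_old_files` function, its parameters, and the use of file modification time as the criterion are all assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path


def archive_old_files(source, target, max_age_days, now=None):
    """Move files in `source` not modified within `max_age_days` into `target`.

    Returns the names of the files that were moved, in sorted order.
    """
    source, target = Path(source), Path(target)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    target.mkdir(parents=True, exist_ok=True)

    moved = []
    # Materialize the listing first so moving files does not disturb iteration.
    for path in sorted(source.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

Note that a plain move like this leaves nothing behind on primary storage, so users would have to know to look in the archive target; that is exactly the kind of behavior that differs from one solution to the next.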
As archiving attracts more interest because of the economic savings it provides, solution variations will add to the confusion. It will always require a bit more explanation to draw an accurate picture.
(Randy Kerns is Senior Strategist at Evaluator Group, an IT analyst firm.)