Storing information for long periods of time is fraught with peril.
Near-term protection threats are enough of a challenge with bit errors, drive failures, cyberattacks, human error and natural disasters -- just to name a few. Long-term preservation, however, adds to those challenges, with changing hardware architectures, software platforms, applications and data formats.
In addition, new demands for increased accessibility, collaboration and big data analytics are raising questions not only about how long data should be stored, but also about how long it should remain available. Today, a number of innovations are helping to ensure long-term preservation and accessibility of data.
Out with the old
For long-term preservation, organizations often turn to tape media. Of course, this format is by no means infallible, and its limitations have been discussed ad nauseam.
Recent innovations, specifically LTO technology and the introduction of barium ferrite (BaFe) tape media, have helped extend the life of tape to better enable long-term data preservation. Testing provided by manufacturers suggests that BaFe tapes can last up to 30 years without degrading from demagnetization. Despite the increased durability, tape media can still fail, for example from mishandling or poor storage conditions. Redundancy and read verification can help, but they add cost as capacity expands.
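As a rough illustration of what read verification involves, the sketch below records a checksum for every file when an archive is written and later rereads the files to confirm they still match. It is a minimal example under stated assumptions, not any vendor's method: the function names (sha256_of, build_manifest, verify) are hypothetical, the archive is assumed to be mounted as an ordinary directory, and SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archive files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root at archive time."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    damaged = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(rel_path)
    return damaged
```

Every verification pass rereads the full archive, which is one reason the cost of this kind of checking grows with capacity.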
However, preserving data is only one piece of the puzzle. The data must also be kept in a readable format. With tape formats continuing to evolve, it is not uncommon for organizations to retain old tape drives along with the tapes to ensure that the data can be read at a later date. It is also common practice to archive installation media for older applications. These practices can help organizations preserve data longer. But if you need rapid access to data, tapes sitting in a vault somewhere aren't going to cut it.
In with the new
Software-defined storage (SDS) may offer an alternative -- or at least play a part in an alternative. While there are multiple flavors of SDS, a common theme of the technology is hardware abstraction and the ability to present a consistent interface to data regardless of the hardware underneath it. Some products can continue to present this consistent interface even as the underlying hardware infrastructure evolves, without data migration. However, not all SDS products enable this, so it is an important distinction to draw during technology evaluations.
For example, an SDS product that supports deployment on standard x86 architectures and pooling of multiple generations of hardware together can provide significant value when attempting to maintain data online for long durations. While nothing lasts forever, some form of the x86 architecture has been around for nearly 40 years. Of course, no one knows if x86 technology will be around for another 40 years, but it is one of the more consistent standards in the industry.
The unique aspect that SDS provides is infrastructure flexibility. The greater the number of hardware options supported, the higher the likelihood that the product can adapt as technology evolves. Continuing the theme of flexibility, some storage architectures are expanding the hardware abstraction layer to include public cloud storage or even tape storage. The key takeaway is that greater flexibility equates to more adaptability as technology changes over time.
Another capability to consider is scalability. Some file- or object-based storage products already support tens, if not hundreds, of petabytes of storage capacity. When evaluating a storage system that will meet storage demands for the next decade or so, look for massive scalability. Also, look for advanced data protection features such as the ability to rebuild to free space and self-healing capabilities.
Not all SDS architectures are the same
While SDS offers a number of benefits that align with long-term archives, there are still a number of considerations to take into account. First, understand that not all SDS architectures are the same. For massive content repositories, limit any investigation to either scale-out file system or object storage architectures. Products based on traditional file systems or block storage architectures will likely not provide the requisite scale. In some cases, even scale-out file systems may not be enough. The hierarchical nature of file systems, even scale-out file systems, can limit the ability to scale to massive multi-petabyte levels. However, even when an SDS architecture can support capacity scaling for the distant future, keeping that much capacity online, with its underlying hardware, power, cooling and space requirements, may be too much for an organization to bear.
Many SDS products, especially those based on scale-out file systems or object storage, tend to focus specifically on active archives, with the underlying assumption being that this data is content the business has decided should remain active and online. However, for many IT organizations, the distinction is not that easy to make. Lacking the proper tools and data analysis, an organization may keep more data online than necessary. The organization may also take the opposite approach -- put everything on tape, and hope that events where that content is needed are so rare that the impact is minimal.
In response to this complexity of archive storage, some products are starting to emerge that can act as a virtual layer over both an active-archive object storage and traditional tape archives. The result is a single storage pool spanning both online and offline media. The number of these offerings is still relatively small compared to the number of dedicated online or offline archive products. The extension of hardware abstraction to include offline media, however, could represent a potential growth area in the years to come.
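To make the idea of a single pool spanning online and offline media more concrete, here is a minimal sketch of a catalog that tracks which tier holds each object and presents one interface over both. Everything in it is hypothetical: the Tier and ArchivePool names, the in-memory dictionaries standing in for real storage, and the simple policy of moving an object online the first time it is read. A real product would queue a tape recall rather than return offline data immediately.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """Stand-in for a storage tier; a dict plays the role of real media."""
    name: str
    online: bool
    objects: dict[str, bytes] = field(default_factory=dict)


class ArchivePool:
    """Hypothetical single namespace over an online and an offline tier."""

    def __init__(self, online: Tier, offline: Tier) -> None:
        self.tiers = [online, offline]
        self.catalog: dict[str, Tier] = {}  # key -> tier currently holding it

    def put(self, key: str, data: bytes, keep_online: bool = True) -> None:
        tier = self.tiers[0] if keep_online else self.tiers[1]
        previous = self.catalog.get(key)
        if previous is not None and previous is not tier:
            # The object is moving between tiers, so drop the old copy.
            previous.objects.pop(key, None)
        tier.objects[key] = data
        self.catalog[key] = tier

    def locate(self, key: str) -> str:
        """Report which tier currently holds the object."""
        return self.catalog[key].name

    def get(self, key: str) -> bytes:
        """Read through one interface, wherever the object lives."""
        tier = self.catalog[key]
        data = tier.objects[key]
        if not tier.online:
            # Assumed policy: promote recalled objects to the online tier.
            self.put(key, data, keep_online=True)
        return data
```

A caller only ever uses put and get on the pool; the catalog, not the application, decides which medium is involved.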
SDS products and the hardware abstraction they provide can support long-term data preservation and accessibility. While SDS does not solve all of the challenges of keeping large amounts of data available for long periods of time, it is another tool in the IT arsenal, an added layer of flexibility that helps IT organizations evolve with the technology landscape.