The latest trends in purpose-built data archive appliances include the addition of features such as data deduplication to enable increased efficiency in data storage over the long haul, according to Russ Fellows, a senior partner at Evaluator Group Inc. in Broomfield, Colo.
In this podcast interview, Fellows discusses the evolution of and latest developments in data archive appliances. He also offers advice about when a purpose-built archiving storage system would be a good choice and when an alternative data archiving approach might make more sense.
You can read a transcript of the interview below or download the MP3.
SearchStorage.com: What are the key defining characteristics of an archive appliance?
Fellows: Archiving is a topic that's received continuing attention, and for good reason, because it can be a large part of the overall cost of maintaining and managing your data. In terms of the defining characteristics for an appliance, archiving can be accomplished by a couple of different methods. Typically, it was done through the use of software, but more recently, appliances have come to pass that incorporate software and hardware, and they also in many cases include regulatory compliance features such as WORM, or write once read many, locks and file-retention periods on a per-file basis. Some of the other features would include drive spin-down, also known as MAID [massive array of idle disks]. That's a very desirable feature in order to keep costs down as you're storing and maintaining data over long periods of time.
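As a rough illustration of the WORM locks and per-file retention periods mentioned above, the following Python sketch models an in-memory archive that refuses to overwrite a file and refuses to delete it before its retention period ends. The class name `WormArchive`, its methods, and the `retain_days` parameter are all hypothetical, chosen for illustration; they do not describe any vendor's appliance or API.

```python
from datetime import datetime, timedelta


class WormArchive:
    """Toy write-once-read-many store with a retention period per file.

    Hypothetical illustration only; real appliances enforce this in
    firmware or the file system, not in application code.
    """

    def __init__(self):
        # file name -> (contents, earliest time the file may be deleted)
        self._files = {}

    def write(self, name: str, data: bytes, retain_days: int, now: datetime) -> None:
        # Write once: an existing file can never be overwritten.
        if name in self._files:
            raise PermissionError(f"{name} is write-once and already exists")
        self._files[name] = (data, now + timedelta(days=retain_days))

    def read(self, name: str) -> bytes:
        # Read many: reads are always allowed.
        return self._files[name][0]

    def delete(self, name: str, now: datetime) -> None:
        # Deletion is blocked until this file's own retention period ends.
        _, retain_until = self._files[name]
        if now < retain_until:
            raise PermissionError(f"{name} is under retention until {retain_until}")
        del self._files[name]
```

Because the retention date is stored with each file, two files written on the same day can expire at different times, which is what "on a per-file basis" implies.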
SearchStorage.com: Can you describe the different types of archive appliances that are available to corporate IT shops?
Fellows: When it comes to archiving appliances, there are really two main categories. The first category includes those [with] compliance features that I was talking about: WORM locks, file-retention periods on a per-file basis. Others would be more broadly defined as an archiving appliance in the fact that they have a way to manage and store data cost-effectively over a long period of time. These appliances have really come to pass with SATA drives becoming used in the enterprise. SATA drives enabled a much more cost-effective way to store large amounts of data on disk. Until then, the only way to [store large amounts of data] cost-effectively was on tape, which [caused] a big problem when people needed to access the data relatively quickly because the tape recall times can be so long.
So, to reiterate, today's archiving appliances are disk-based in almost all cases, and they may include some type of regulatory compliance features as well.
SearchStorage.com: What are the latest trends in archive appliances?
Fellows: The first trend really has to be data deduplication. With the large amounts of data that are involved and the effectiveness of data deduplication recently, everyone either has data deduplication in their appliances or is finding ways to try to apply it. Quite a few of the original appliances had simpler forms of data deduplication, known as single instancing or single-instance storage, so that if you store multiple copies of the same file, you would actually only store one copy. That's relatively easy to do. But more recently, variable-block deduplication, [which is a lot more effective], has been applied.
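The single-instancing idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular appliance works: the class `SingleInstanceStore` and its methods are hypothetical, and SHA-256 is simply an assumed choice for identifying identical file contents. It covers only whole-file single instancing, not the variable-block deduplication Fellows mentions.

```python
import hashlib


class SingleInstanceStore:
    """Toy single-instance store: identical file contents are kept once.

    Hypothetical illustration; names and hash choice are assumptions.
    """

    def __init__(self):
        self.blobs = {}  # content digest -> file contents (stored once)
        self.names = {}  # file name -> content digest

    def put(self, name: str, data: bytes) -> str:
        # Identify the file by a hash of its full contents.
        digest = hashlib.sha256(data).hexdigest()
        # Keep the contents only if this digest has not been seen before.
        self.blobs.setdefault(digest, data)
        self.names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]

    def stored_bytes(self) -> int:
        # Physical bytes kept, counting each unique file body once.
        return sum(len(blob) for blob in self.blobs.values())
```

Storing the same report under two names consumes the space of one copy. A single changed byte produces a different hash, so the whole file is stored again; that limitation is what block-level approaches aim to address.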
Another trend is that companies are looking for ways to incorporate archiving more as a part of their active data management practices. Everyone has known for a while that archiving can save you a lot of money by not having to back up your data constantly. So, as data quantities have increased and costs keep going up, people are looking for ways to be more effective. People are looking to make archiving a more normal part of their practice.
And then I would say, finally, there's also the emergence of more flexible back ends. People are looking for ways to use cloud platforms, either public or private, to store and archive data for the long term.
SearchStorage.com: In what scenarios, or for what types of companies, would you recommend an archive appliance?
Fellows: I'd say that there are really two factors that can motivate a company to use an archiving appliance. The first is if they have to comply with some regulation or mandate that requires them to maintain data for some period of time. That's why we saw the emergence of a lot of these appliances about seven or eight years ago when a lot of regulations kicked in and people were looking for quick and easy ways to comply.
The second motivation is really closer to the origin of archiving, and that again is the desire to manage your data effectively. It's less a matter of company size than of the need and ability to manage your data effectively, and of whether you're under scrutiny for complying with regulations.
SearchStorage.com: In what scenarios would you advise a user to consider an approach other than an archive appliance?
Fellows: There are ways to do archiving more cost-effectively. If you don't fall under Sarbanes-Oxley or HIPAA or Gramm-Leach-Bliley or any of the myriad regulations, and you just want to better manage your data, then you can use some more traditional techniques, which would often include tape as a good mechanism for long-term archiving. Tape is still, today, by far the most cost-effective long-term solution because there's very close to zero cost to storing that data for long periods of time. You don't have to pay to power it, whereas with disk-based solutions, you have to keep the disks spinning. So, if you don't have regulatory compliance [issues], there's a lot more flexibility in how you can go about creating an archive.
SearchStorage.com: Is there still confusion in the marketplace about the difference between archiving and backing up data?
Fellows: Yes, I think there's always been some degree of confusion, and that goes back to how these things came to pass. A backup is really a copy of data, and the way archives emerged is people would keep a lot of backups and then at some point they would choose one and say, 'OK, that's my archive,' which really means that's a snapshot or a point-in-time copy of the data, and that really is the only copy of that data. An archive is really the primary copy of data, whereas a backup is a secondary copy. But because of the way that it arose with keeping old backup tapes, the terms got intermingled and confused by many people.
This was first published in June 2010