The archiving appliances product class may be a relatively small one, but there are different types of purpose-built appliances to consider. One way to categorize them is by their target audiences. The more costly, scalable and highly available disk archiving appliances take aim at large companies, while the less expensive ones are typically email archiving appliances that hold greater sway with small- and medium-sized businesses (SMBs). High-end appliances generally require software to capture and move the data from the source application, such as Exchange Server, to the back-end storage. By contrast, the low-end appliances tend to bundle together the pieces that users need, including data capture.
Traits that tend to distinguish an archiving appliance from a mere product bundle are a holistic, special-purpose, service-oriented approach and an overarching management stack developed specifically for the device.
"Very often, the term archiving appliance applies to an email archive stack plus a little bit of hardware that's been consolidated and shipped as a unit. Those are more typically intended to cover the low end of the market space. They're not very customizable, but that's OK. That's what the customer wants," said Craig Butler, business line manager for archive solutions at IBM. "What we run into as soon as we get above 1,000 seats in an email archive configuration are customers who want to customize things. They want it a different way than the appliance allows."
Archiving appliances include capabilities such as write once, read many (WORM), data-retention management, indexing, search, encryption, single-instance storage or more sophisticated data deduplication. Special systems for medical archives cater to the healthcare industry.
Many appliances have a sophisticated type of file system, whether object-based or clustered, noted Rick Villars, vice president of storage systems and executive strategies at IDC.
"It's not just about the file system. It's a lot about the value-added functions," Villars said. "There are efforts to try and make these more open development environments, because it's really hard to do this in a way where one vendor has to be an expert at every single one of these capabilities" such as new compression techniques.
Villars added that processing power is more important than disk performance because of the need to index and search on a constant basis.
"The biggest problem with appliances is scale. If email volumes or content volumes increase rapidly, then are you going to be able to process them, analyze them and archive?" said Brian Babineau, a senior consulting analyst at Enterprise Strategy Group.
"The other vector of scale is capacity," Babineau added. "Can it store everything that you need to store for the time frame that you need to store it in? The worst thing that can happen is you have to keep buying individual appliances, and that becomes problematic. It requires some manual labor. You have to do some configuration work."
Here's how the most prominent disk-based archive appliances stack up:
High-end disk archiving appliances for large companies
The list of high-end disk-based appliances that can generally archive multiple content types, from files to email, includes Dell Inc.'s new DX Object Storage (released in May), EMC Corp.'s Centera, Hewlett-Packard (HP) Co.'s Integrated Archive Platform (IAP), Hitachi Data Systems' Hitachi Content Platform (HCP), IBM's Information Archive (IA) and Nexsan Corp.'s Assureon.
The most recent update to EMC's market-dominant Centera added the Virtual Archive management layer, enabling customers to aggregate multiple instances of the product into a single Centera cluster, even when geographically dispersed, noted Steve Spataro, director of Centera product marketing at EMC.
With its introduction of Centera in 2002, EMC sought to establish a market for content-addressable storage (CAS), also known as content-addressed or content-aware storage. CAS offerings distinguished themselves through their use of a hash function algorithm to assign a unique identifier, or digital fingerprint, to each piece of data or stored object. If an object/piece of data changed, it received a new unique identifier or content address.
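The content-addressing idea can be illustrated with a minimal Python sketch. It is not how Centera or any other product is implemented: the `ContentAddressedStore` class, the in-memory dictionary and the choice of SHA-256 as the hash function are all assumptions made for illustration.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store that addresses objects by a hash of their content."""

    def __init__(self):
        self._objects = {}  # content address -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is derived purely from the bytes being stored
        # (SHA-256 is an assumption; real products choose their own hash).
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is kept only once.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def __len__(self) -> int:
        return len(self._objects)


store = ContentAddressedStore()
original = store.put(b"Quarterly report, draft 1")
duplicate = store.put(b"Quarterly report, draft 1")  # same bytes, same address
revised = store.put(b"Quarterly report, draft 2")    # changed bytes, new address

print(original == duplicate)  # True
print(original == revised)    # False
print(len(store))             # 2
```

Because the address depends only on the content, storing the same object twice costs nothing extra, and any change to an object shows up as a different address rather than an overwrite.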
Using a CAS product in conjunction with archiving software that was integrated and certified to work with it, an IT shop could configure its systems for email archiving, for instance, and make sure that no messages could be changed or deleted, making the technology ideal for regulatory compliance or litigation holds.
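The "no changes, no deletions" behavior can be sketched in a few lines of Python. This is a hypothetical model, not any vendor's API: the `WormArchive` class, the `RetentionError` exception and the per-message `retain_until` date are assumptions introduced here to show the write-once, retention-enforced pattern.

```python
from datetime import date


class RetentionError(Exception):
    """Raised when an operation would violate the write-once or retention rule."""


class WormArchive:
    """Toy write-once, read-many archive with a retention date per message."""

    def __init__(self):
        self._records = {}  # message id -> (body, retain_until)

    def store(self, message_id: str, body: bytes, retain_until: date) -> None:
        # Write once: an archived message can never be overwritten.
        if message_id in self._records:
            raise RetentionError(f"{message_id} is already archived")
        self._records[message_id] = (body, retain_until)

    def read(self, message_id: str) -> bytes:
        return self._records[message_id][0]

    def delete(self, message_id: str, today: date) -> None:
        # Deletion is refused until the retention period has passed.
        _, retain_until = self._records[message_id]
        if today <= retain_until:
            raise RetentionError(f"{message_id} is retained until {retain_until}")
        del self._records[message_id]


archive = WormArchive()
archive.store("msg-001", b"Re: contract terms", retain_until=date(2017, 12, 31))

blocked = []
try:
    archive.store("msg-001", b"tampered text", retain_until=date(2017, 12, 31))
except RetentionError:
    blocked.append("overwrite")
try:
    archive.delete("msg-001", today=date(2012, 6, 1))
except RetentionError:
    blocked.append("delete")

print(blocked)                   # ['overwrite', 'delete']
print(archive.read("msg-001"))   # b'Re: contract terms'
```

A litigation hold could be modeled the same way, for example by pushing `retain_until` further out while the hold is in effect.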
The elimination of the traditional file system, with its capacity limits and management challenges, enabled the products to potentially handle huge amounts of data, although vendors such as Hitachi Data Systems front their products with file systems to return a traditional file handle to an application rather than an obscure hash identifier.
As Hitachi Data Systems and other vendors took variant approaches, the technology's defining traits eventually declined in importance. Even its single-instance store capability lost some luster with the introduction of data deduplication technologies. Prominent market research firms such as Gartner Inc. ultimately stopped viewing CAS as a product category and instead evaluated products such as Centera based on the user problems they aimed to address.
"EMC was huge in promoting CAS, but really, CAS is nothing but an implementation detail that almost nobody should ever be concerned with," said Russ Fellows, a senior partner at Evaluator Group Inc. "CAS is an overused term."
EMC has taken its share of hits over the years and worked to address concerns on several fronts, including performance and proprietary technology. The company argued that its Centera API was available to anyone via the Web, yet it also worked with other vendors on the eXtensible Access Method (XAM) standard to help developers connect applications to object-based storage systems.
Meanwhile, vendors such as Hitachi Data Systems promote their support for a wider range of standard protocols, including NFS, CIFS, HTTP and REST. "It is not a 'black box' where data can only enter via deep integration with applications and APIs," claimed Jeff Lundberg, senior product marketing manager of file and content services at Hitachi Data Systems, in an email interview.
IBM also touts its support of interfaces such as NFS. Its Information Archive, which became generally available in February, is the successor to its DR550, but the new product is a more scalable "complete rewrite," replete with different servers and storage, according to IBM's Butler. He said IBM has already withdrawn the large version of its DR550 from the market and expects to do the same with the low-end version by year's end, although it will continue to support both products.
Another distinction in the new Information Archive is its ability to work with multiple content types at the same time, using different sets of policies, under a single point of management. Information Archive permits the user to have up to three collections of data, such as email, database and file archiving. With the DR550 and some rival offerings, all of the data must either be locked down for compliance or completely open to the archiving application to add and delete data, Butler said.
"You couldn't have some one way and some another way," he said. "We had a lot of customers who wanted the ability to deal with multiple kinds of data sets."
Although Information Archive doesn't ship with an email or database archiving software stack, IBM sells general-purpose archiving software called Content Collector and database archiving software called Optim.
Hewlett-Packard also licenses its own connectors for email, file and database archiving, and plans to add more connectors over time, according to Manu Chadha, director of archiving products and strategy for information management at HP.
"HP and IBM both are very focused on selling their software solutions in addition to the storage repository in order to drive the content to those devices," said Sheila Childs, a research director at Gartner.
HP also promotes the scalability of its Integrated Archive Platform appliance (formerly known as Reference Information Storage System or RISS), claiming it can range from 2.1 TB to 450 TB in a single system and grow exponentially through its grid-based federated approach.
Nexsan also emphasizes the federated approach in its Assureon appliance. Bob Woolery, senior vice president of marketing at Nexsan, claimed via email that the product can run as a single-appliance node or scale to hundreds of nodes that automatically federate into a fault-tolerant cluster through next-generation CAS technology called Federated CAS Archive. He added that the federated design makes the product suitable for cloud-based hosted storage.
But that wasn't the main attraction for Joe Funaro, director of IT at Lenox Hill Radiology & Medical Imaging Associates PC in New York City, when he compared Assureon with competitive archive appliance offerings. Cost was the big factor for him.
Making the shift from tape to disk, Funaro said he favored an appliance over alternatives because he wanted proven technology that would be easy to deploy, as well as a vendor willing to stand behind its products. Doctors had already experienced occasional frustration while waiting a half hour or more to access X-ray, CT and MRI images.
"I didn't want to go out on a limb," he said, "and recommend something that wasn't going to work."
Regulatory compliance was one consideration, but equally or more important were reliability, speed and efficiency. Lenox Hill's GE Medical Systems Picture Archiving and Communication System (GE PACS) handles some two years' worth of nearline storage of its medical images. The archive holds more than 150 million files, according to Funaro.
A script runs on a nightly basis to copy the files and store them to a pair of redundant Assureon SATABeasts, each with 44 TB of usable capacity. To boost performance, Nexsan recommended the installation of a staging server, running its File System Watcher (FSW) client, between the GE PACS and the back-end storage archive. GE and Nexsan did additional integration work to further enhance performance.
Doctors are now typically able to access archived images in a minute or less. Lenox Hill's SATABeasts are currently at 80% capacity, and the company plans to add another during the coming year, Funaro said.
"It used to take 20 minutes for us to find the tape and another 10 to 15 minutes to restore the tape, so before the doctor had it up on the screen, it was a half hour," Funaro said. "With this, there's no human involvement."
Low-end disk-based archive appliances for SMBs
Archive-appliance demand is greatest for email, and the list of disk products targeted at SMBs includes ArcMail Technology's Defender, Barracuda Networks Inc.'s Message Archiver, Jatheon Technologies Inc.'s Plug n Comply (PnC) and Mirapoint Software Inc.'s more scalable RazorSafe. In addition, Iron Mountain earlier this year acquired Mimosa Systems Inc. and its archive technology.
The benefits tend to be twofold. Companies are able to retain all emails for the purpose of regulatory compliance, potential litigation or even the simple act of ferreting out an important message that a user may have inadvertently deleted. Plus, offloading messages from a messaging server such as Microsoft Corp.'s Exchange could help to boost performance.
Jeff Grey, IT director at General Agency Services Inc. in Mount Pleasant, Mich., manages approximately 50 users who each receive 2 GB of space on the company's Exchange Server. "The bigger a mailbox gets, the slower it gets, too," he said.
Prime offenders are email attachments, but Barracuda's Message Archiver helps by copying them and leaving a stub linking to the original file, Grey said.
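Attachment stubbing of this kind might look roughly like the Python sketch below. It does not represent Barracuda's implementation: the `stub_attachments` function, the dictionary-based message format and the `archive://` link scheme are hypothetical, and keying the archive by a SHA-256 hash is an added assumption that happens to give single-instance storage as a side effect.

```python
import hashlib


def stub_attachments(message: dict, archive: dict) -> dict:
    """Copy each attachment into the archive and return a message with stubs."""
    stubbed = []
    for name, payload in message["attachments"]:
        # Key the archived copy by its content hash (an assumption), so the
        # same file sent to many users is stored only once.
        key = hashlib.sha256(payload).hexdigest()
        archive.setdefault(key, payload)
        # The stub keeps the file name and a link to the archived copy.
        stubbed.append((name, f"archive://{key}"))
    # Return a new message; the caller decides when to replace the original.
    return {**message, "attachments": stubbed}


archive = {}
report = b"%PDF-1.4 sample bytes " * 1000  # stand-in for a large attachment

msg_to_alice = {"to": "alice", "subject": "Q3 report", "attachments": [("q3.pdf", report)]}
msg_to_bob = {"to": "bob", "subject": "Q3 report", "attachments": [("q3.pdf", report)]}

stub_alice = stub_attachments(msg_to_alice, archive)
stub_bob = stub_attachments(msg_to_bob, archive)

print(len(archive))                    # 1
print(stub_alice["attachments"][0][0]) # q3.pdf
```

The mailbox copy shrinks to a short link per attachment, while the full file lives once in the archive no matter how many recipients received it.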
The archiving system currently stores more than 900,000 messages on mirrored 500 GB drives. Grey noted that data compression technology helps the storage work better, and his management console recently showed a compression rate of 48.3%.
"I pay 80 cents per gigabyte for an online backup service," he said. "Anywhere I can save room helps."
Grey liked the fact that Barracuda let him test the product for up to 60 days free of charge. He said the system was easy to install, with a "self-learning" element upon log-in. New features include the ability to archive contacts and calendar items from Exchange.
So far, Grey has encountered one problem: A hard drive went bad, but since the drive was mirrored, he didn't lose any data. Barracuda had a new hard drive in his hands the next day. "It's the best tech support I have ever dealt with, and I've been in this industry for 26 years," he said.
Reflecting on the decision to buy an archiving appliance, Grey added, "It brings huge peace of mind. Even if my Exchange Server died right now, I wouldn't lose any email."