Managing and protecting all enterprise data


Enterprise-ready VTLs

Enterprise-class virtual tape libraries (VTLs) are an increasingly cost-effective destination for data that needs to be backed up or restored quickly, and isn't quite ready for offsite archiving. But the more complex the storage environment, the more attention users should pay to how the VTL provides scalability, performance, manageability and deduplication.

Most virtual tape libraries will accelerate backups, but there are key differences among them when it comes to scaling, interoperability and management.

For a growing number of organizations, a virtual tape library (VTL) provides a cost-effective addition to their disk-based backup. By storing data to disk rather than tape, the VTL speeds data backup and retrieval without requiring users to change their existing backup processes. This is because, to the backup server and backup application, the VTL looks like a traditional tape library. Benefits include squeezing ever-larger data sets into tight backup windows, retrieving data quickly when required for legal or regulatory reasons, and reducing the operational and reliability risks related to tape.

However, not all VTLs are created equal. The larger and more complex the storage environment, the more attention users should pay to how the VTL provides scalability, performance, manageability and deduplication (see "Enterprise VTL feature checklist," below). By storing only unique pieces of data, deduplication can reduce disk capacity and bandwidth needs by ratios as high as 30:1 or 40:1.

For some users, VTLs may be outright replacements for existing tape libraries, eliminating the cost of manually handling tapes, the delays of reading sequentially through an entire tape to find data, and the disruptions that result when a tape jams or breaks.

Michael Grillo, principal IT engineer at Foxwoods Resort Casino in Ledyard, CT, is looking forward to using his 80TB of VTL capacity to completely replace tape because of the VTL's increased reliability and speed. He backs up 4.5TB to 5TB each day. "Our next step is to replace tape with a VTL at our disaster recovery [DR] site that's nine miles from the main data center," he says.

Some VTLs are sold as "appliances" with preloaded software that may also be preconfigured for a specific storage server, while others are sold as software that can be installed on generic servers. The advent of deduplication has made disk-based VTLs an affordable fit for very large-scale storage and for data replication to remote sites over WANs.

Enterprise VTL feature checklist

  • Modular upgrades, so you don't have to buy a separate virtual tape library (VTL) for each increase in storage capacity.

  • Deduplication method(s), inline or post-processing, to meet your needs.

  • The ability to set common rules or perform common management functions across multiple VTLs.

  • Support for your backup software and hardware.

  • Reliability features such as clustering and failover.

Enterprise requirements
Definitions of an enterprise VTL environment vary, but vendors and analysts generally agree that an enterprise-class VTL should back up at least tens of TBs of data daily and be able to store hundreds of TBs of data. These high-end VTLs also need to scale easily and support complex backup and recovery environments, such as those that include multiple remote offices or multiple VTLs in the storage fabric.

The ability to easily add disk capacity is important not only because organizations are backing up more and more data all the time, says Lauren Whitehouse, an analyst at Milford, MA-based Enterprise Strategy Group (ESG), but because companies often buy their first VTL to meet the needs of a specific department and then add to it as they see the value it provides.

Because of the number of production servers involved, failed backups are considered unacceptable in an enterprise VTL environment, says Andrei Shishov, VP of backup platforms engineering at EMC Corp. Reliability is often provided through features such as clustered failover among VTL nodes, as well as redundant components such as power supplies within individual nodes.

Performance is another basic requirement for an enterprise VTL, so it can back up data quickly enough to fit within an organization's backup window and restore files quickly when needed, says Whitehouse. Exact definitions of what is sufficient performance vary, although Peter Eicher, director of product marketing at FalconStor Software Inc., estimates the minimum at 300MB/sec to 400MB/sec. In addition to improving reliability, clustering can boost performance by spreading the work of reading and writing data, and/or deduplicating it, among multiple VTL servers.

Because enterprise customers "potentially have not only the local data center, but maybe some remote offices to be concerned about," support for a variety of backup applications is also a must-have, says Whitehouse. Then there's the need, adds Shishov, for the VTL software to work with the widest possible variety of backup software, servers and server components, such as host bus adapters and disk drives.

The desire to cut the purchase and energy costs of disk drives, and to reduce the bandwidth required to replicate data among various sites (such as for DR), has made deduplication a must-have feature for enterprise VTLs (see "Dedupe options," below). Some vendors are also introducing massive array of idle disks (MAID) or spin-down features that power up disks only when they read or write data.

Dedupe options
Deduplication can be done inline (as the data is taken into the virtual tape library [VTL]) or post-processing (after it's written to disk). Inline reduces the amount of disk space required but can slow performance; post-processing requires more disk space, but can speed backups by allowing the disks to work at full speed as data is written to them.
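The tradeoff between the two approaches can be illustrated with a minimal sketch. It assumes fixed-size chunks and SHA-256 fingerprints, which are choices made for this illustration and not a description of any vendor's method; the function names and the tiny chunk size are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; far smaller than any real system, chosen for readability


def split_chunks(stream: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Cut a backup stream into fixed-size chunks (an assumption of this sketch)."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def stored_bytes(store: dict[str, bytes]) -> int:
    """Total bytes held in the deduplicated store."""
    return sum(len(chunk) for chunk in store.values())


def inline_dedupe(stream: bytes) -> tuple[dict[str, bytes], int]:
    """Fingerprint each chunk during ingest; duplicates never reach disk."""
    store: dict[str, bytes] = {}
    for chunk in split_chunks(stream):
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = chunk
    # Disk usage only ever grows here, so the peak equals the final size.
    return store, stored_bytes(store)


def post_process_dedupe(stream: bytes) -> tuple[dict[str, bytes], int]:
    """Land the whole stream on disk first, then deduplicate afterward."""
    landing_area = split_chunks(stream)  # written with no hashing in the ingest path
    peak = sum(len(chunk) for chunk in landing_area)
    store: dict[str, bytes] = {}
    for chunk in landing_area:
        store.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)
    return store, peak


backup = b"AAAABBBBAAAACCCCAAAA"  # five chunks, three of them unique
inline_store, inline_peak = inline_dedupe(backup)
post_store, post_peak = post_process_dedupe(backup)
print(inline_peak, post_peak)
```

Both paths end with the same 12 bytes of unique data, but the post-processing path needs room for all 20 bytes at its peak, while the inline path does the fingerprinting work in the middle of the backup.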

Many vendors offer their own spin on deduplication, seeking to reduce the amount of time, processing power or disk space consumed by the dedupe process. Diligent Technologies Corp. (an IBM Corp. company) first identifies similarities among the data being backed up, and then submits only those similarities for detailed byte-by-byte deduplication. IBM Corp. will incorporate Diligent's software into its own products, says Tom Grave, Diligent's director of product management. EMC Corp. recently added dedupe functions (licensed from Quantum Corp.) to its VTLs.

Quantum's DXi7500 high-end VTL asks a user to choose parameters such as their backup window and the amount of data they need to back up, and then automatically chooses whether to perform deduplication inline or after the data is backed up, says Mike Sparkes, Quantum's product marketing manager for enterprise disk systems.

While many deduplication vendors boast of deduplication ratios of 20:1, says Sparkes, Quantum's customers report data-reduction ratios of anywhere from 5:1 or 6:1 to 30:1 or 40:1. Just how much space deduplication will save a given customer varies based on the amount of data being stored, how long it's kept and how often full backups are done, as well as the deduplication technology being used.
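To see why retention and backup frequency matter so much, consider a deliberately simple model. It assumes that only full backups are kept, that every full is the same size, and that a fixed fraction of the data changes between one full and the next. None of these assumptions comes from a vendor, and the function name and parameters are hypothetical.

```python
def estimated_dedupe_ratio(fulls_retained: int, change_rate: float) -> float:
    """Rough ratio of logical data to unique data kept on disk.

    Assumes the first full is stored entirely and each later full adds only
    the fraction `change_rate` of new data. Compression, incrementals and
    metadata overhead are ignored.
    """
    if fulls_retained < 1:
        raise ValueError("at least one full backup must be retained")
    if not 0.0 <= change_rate <= 1.0:
        raise ValueError("change_rate must be between 0 and 1")
    logical = fulls_retained                       # in units of one full backup
    unique = 1 + (fulls_retained - 1) * change_rate
    return logical / unique


# Same assumed 2% change between weekly fulls, different retention periods.
for weeks in (4, 12, 52):
    print(weeks, round(estimated_dedupe_ratio(weeks, 0.02), 1))
```

Under these assumptions, keeping 12 weekly fulls gives a ratio a little under 10:1, while keeping 52 gives more than 25:1 on the same data, which is consistent with the wide range Sparkes describes.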

Ease of management is key to minimizing the VTL's total cost of ownership, and includes everything from the clarity of the user interface to the ability to easily view and manage groups of VTLs from a single console. Specific features to consider are how easy it is to access all VTLs in a group through a single sign-on, and the ability to view a consolidated report for all the systems and make simultaneous changes in common configuration settings.

Security wasn't a key requirement cited by users or analysts because VTLs are generally used in data center environments, which are regarded as secure. Security becomes more important when data leaves the data center stored on tape, so most users say they perform encryption at the tape library and not the VTL. However, some vendors, such as FalconStor and Sepaton Inc., do offer encryption in their VTLs.

Depending on whether a VTL is used along with or as a replacement for tape, users might need it to write directly to or read directly from a physical tape. While most VTLs can import and export tape, says ESG's Whitehouse, another important consideration is whether the backup software's catalog is updated to reflect any changes in the data stored on the VTL, so backup administrators can more easily track the location of the backed-up data.

Click here for a comparison of VTLs (PDF).

Choosing a vendor
Choosing the right enterprise VTL depends on a user's existing backup environment, the amount of data to be backed up, the complexity of the storage environment and, most importantly, what problems they're trying to solve.

Dave Russell, research VP at Stamford, CT-based Gartner Inc., ranks EMC, Sepaton (also OEMed by Hewlett-Packard [HP] Corp.) and Sun Microsystems Inc.'s StorageTek Division as the top three "pure-play" VTL vendors. "EMC has the most market share and is a very scalable solution," he says. "Sepaton and HP are extremely scalable, have very high performance and have excellent integration. Sun StorageTek is becoming more of a force in the marketplace, although they are the last" to enter the market, he notes.

In a January 2008 report, Stephanie Balaouras, an analyst at Cambridge, MA-based Forrester Research Inc., wrote that FalconStor "leads for the completeness of its product offering and strategy," while Fujitsu Siemens Computers leads for its "host support, architecture and tape integration." She praised those two and EMC for "the most comprehensive ... interoperability, good scale, solid resiliency features and manageability," due in part to the maturity of their offerings.

FalconStor scored well in areas such as deduplication, tape management, replication and global management, according to Balaouras. She also noted its VTL product is built on its IPStor platform, "which enables not only VTL but snapshots, continuous data protection and IP-based replication." ESG's Whitehouse gives Diligent Technologies Corp. (an IBM Corp. company), FalconStor and Sepaton the highest ranks for ease of management.

NetApp Inc., Quantum Corp. and Sun StorageTek "are two to three product updates away from closing the gap" with the leaders in areas such as clustering, deduplication and replication, according to Balaouras.

Balaouras describes Copan Systems Inc. and Sepaton as "massively scalable VTLs" that are focused primarily on customers who want to replace, rather than complement, their tape libraries. Copan recently announced the option of 1TB drives for its Revolution 300 Series Platform, a "hot standby" deduplication option that provides a spare deduplication engine that seamlessly replaces a failed unit, as well as a new 40-drive VTL cache option that allows Copan systems to support more than 1,000 concurrent data streams.

ESG's Whitehouse gives Sepaton's grid architecture the highest marks for scalability, followed by Diligent and FalconStor.

Data Domain Inc. is a leading deduplication vendor. However, "VTL is the minority of our business," says Brian Biles, co-founder and VP of product management. While its software provides "good support for host operating systems, backup apps and network connectivity," says Forrester Research's Balaouras, "it lacks broad emulation of popular tape drives and tape libraries because of its focus on tape elimination rather than tape integration." For Data Domain, she says, "the VTL interface is simply a means by which to introduce a disk appliance less disruptively into the environment." Sun's StorageTek VTL Plus is based on FalconStor technology, but Balaouras says Sun hopes to use its own technology, such as the Solaris Zettabyte File System and its storage products, "to increase scale and reliability through OS-level clustering and software RAID at an affordable price."

Wanted: Modular growth
John Wunder, director of IT at Magnum Semiconductor in Milpitas, CA, began looking for a VTL after an acquisition caused a seven-fold increase in the amount of data the company needed to back up every night. The 21TB of data was too much to back up within the company's two-hour nightly backup window.
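The arithmetic behind that problem is easy to sketch. The example below assumes decimal units (1TB = 1,000,000MB) and that the full 21TB moves every night; the function names are hypothetical.

```python
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch


def required_throughput_mb_s(data_tb: float, window_hours: float) -> float:
    """Sustained rate needed to move `data_tb` within the backup window."""
    return data_tb * MB_PER_TB / (window_hours * 3600)


def hours_needed(data_tb: float, throughput_mb_s: float) -> float:
    """Time to move `data_tb` at a given sustained rate."""
    return data_tb * MB_PER_TB / throughput_mb_s / 3600


print(round(required_throughput_mb_s(21, 2)))  # MB/sec needed for 21TB in 2 hours
print(round(hours_needed(21, 400), 1))         # hours needed at 400MB/sec
```

Fitting 21TB into two hours calls for a sustained rate of roughly 2,900MB/sec. At 400MB/sec, the top of the minimum range Eicher cites, the same job would take more than 14 hours.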

He considered purchasing a deduplication appliance from Data Domain that would dramatically speed the backup process by storing only unique changes to the data. However, at the time of his selection, Data Domain packaged its appliance with a set amount of storage, "so if we wanted to add another appliance it would" create another island of storage, he says.

EMC and NetApp take a similar approach, says Whitehouse, forcing a customer to do a "forklift" upgrade of their entire system or buy another VTL just to get a comparatively small additional amount of capacity. That can result in "VTL sprawl," she says, with "lots of different boxes, each of which has to be managed independently."

Using a Quatrio Pivot 300 VTL running Diligent's ProtecTier data deduplication software, Wunder says he can meet his expected 50% annual growth in storage needs by buying another disk array every year and another server every two years. "If I'm wrong on [the amount of] data growth, I'll just escalate" those expansion plans, he says. Each expansion "is less than $20,000 a pop, a small enough amount of money so I can double, triple or even quadruple the size" of the VTL if needed, he adds.
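A quick projection shows what 50% annual growth implies. The 21TB starting point is borrowed from Magnum's nightly backup volume purely for illustration, since the article does not state the VTL's installed capacity; the function name is hypothetical.

```python
def project_capacity(start_tb: float, annual_growth: float, years: int) -> list[float]:
    """Compound a starting capacity forward; index 0 is today."""
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# Assumed starting point of 21TB, growing 50% per year for four years.
for year, capacity in enumerate(project_capacity(21, 0.5, 4)):
    print(year, round(capacity, 1))
```

At that rate, capacity needs roughly double every two years and pass five times the starting figure by year four, which is why the ability to add arrays and servers in small increments matters.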

Wunder uses the VTL to store data before moving it to a Quantum Corp. LTO-3 tape library for long-term offsite archiving and DR. This will save the company $300,000 over four years vs. the cost of building and maintaining a tape library large enough to meet its future needs, he says.

Wunder wishes he could have changed his company's centralized approach to storing data on one NAS filer in one file system (which he inherited from his predecessors). "If I just had this 21TB stored across three or four" storage systems, he says, he could back up data to multiple VTL servers at once, further reducing his backup window.

A number of vendors are combining 4Gb/sec Fibre Channel with multithreading capabilities to do exactly that, says Whitehouse. Sepaton, for example, "can stack any of a number of their nodes in one solution," she says. Each node "might have two or four Fibre Channel connections, so the scalability becomes easy to map out."
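Mapping that out might look like the sketch below. It assumes each 4Gb/sec Fibre Channel port delivers about 400MB/sec, the nominal figure for that link speed, and that throughput scales linearly with ports and nodes. Real systems will fall short of both assumptions, and the function names and the 3,000MB/sec target are hypothetical.

```python
import math

MB_S_PER_4GB_FC_PORT = 400  # nominal per-port figure; an assumption of this sketch


def node_throughput_mb_s(ports_per_node: int) -> int:
    """Best-case ingest rate of one node, assuming linear scaling across ports."""
    return ports_per_node * MB_S_PER_4GB_FC_PORT


def nodes_needed(target_mb_s: float, ports_per_node: int) -> int:
    """Smallest node count whose combined best-case rate meets the target."""
    return math.ceil(target_mb_s / node_throughput_mb_s(ports_per_node))


target = 3000  # MB/sec, a hypothetical requirement
for ports in (2, 4):
    print(ports, node_throughput_mb_s(ports), nodes_needed(target, ports))
```

With two ports per node, the hypothetical target takes four nodes; with four ports per node, it takes two. The estimate is only a ceiling on what the links can carry, since disk speed and deduplication overhead also limit throughput.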

Magnum Semiconductor's Wunder strongly advises users to build extra capacity into their VTL arrays--"a minimum of double of what you need [now]," he says--to leave room not only for regular backup storage, but also for testing applications and DR plans. "I can't test if I don't have a staging area," he says.

Continued growth
Few analysts or vendors see VTLs replacing actual tape libraries anytime soon. But enterprise-class VTLs are an increasingly cost-effective place for data that needs to be backed up or restored quickly, and isn't quite ready for offsite archiving. VTLs will only grow more popular as customers' data storage needs grow, backup windows shrink, and regulators or lawyers insist that critical data be ready for quick access.

VTL vendors need to add new features to their software so that they, and not backup applications, "are in the driver's seat when it comes to managing data in the backup process," says ESG's Whitehouse. To meet this challenge, expect VTL vendors to aggressively increase their data capacities by using higher capacity disks that are more energy efficient; also look for the universal adoption of deduplication, as well as features such as disk spin-down to reduce power and cooling costs.
