With the proliferation of virtual tape libraries (VTLs) in storage environments, storage managers face a growing number of decisions about managing and cataloging the transfer of data to tape.
Still, the same principles they introduced to manage disk and tape in mainframe environments apply to open-systems VTLs, without mainframe costs and restrictions. Some open-systems VTLs include data management software, which manages the copying of data from disk to tape and back again. Two types of software are available to handle this: proprietary management software and third-party backup software.
Quantum's new DXi7500 uses different management software and is configurable as a VTL, a NAS backup target or both. When used as a NAS backup target, the DXi7500 appears as a disk pool to the backup application; when it copies data from its disk pool to physical tape, it optimizes the amount of data stored on each tape by filling it. This alleviates a current problem when copying data from a virtual to a physical cartridge: If a virtual tape cartridge isn't completely filled, the corresponding physical tape cartridge won't be either.
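A little arithmetic shows why filling each tape matters. The sketch below is purely illustrative: the cartridge sizes and the 800 GB tape capacity are made-up figures, and it assumes data streamed from a disk pool can span tapes. It doesn't describe how the DXi7500 actually lays out data.

```python
import math

TAPE_CAPACITY_GB = 800  # hypothetical physical cartridge capacity


def tapes_one_to_one(virtual_cartridges_gb):
    """Each virtual cartridge is copied to its own physical cartridge."""
    return len(virtual_cartridges_gb)


def tapes_when_filled(virtual_cartridges_gb, capacity_gb=TAPE_CAPACITY_GB):
    """Data comes from a disk pool, so each tape is filled before the next one is used."""
    total_gb = sum(virtual_cartridges_gb)
    return math.ceil(total_gb / capacity_gb)


# Hypothetical amounts (in GB) actually written to five virtual cartridges
virtual = [300, 120, 640, 80, 410]

print("one-to-one copies:", tapes_one_to_one(virtual), "tapes")
print("filled tapes:     ", tapes_when_filled(virtual), "tapes")
```

With these sample figures, five partly filled virtual cartridges hold 1,550 GB, which fits on two filled tapes instead of five partly filled ones.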
Using a VTL to manage data copying from disk to tape creates other possible problems (see "Six questions to ask before buying a VTL"). When copying data from disk to tape, the data stored to the physical tape may be in the same format as that stored on disk. In this format, it lacks a tape header and other information needed by the backup software to read the physical tape. This means you have to recover data from the physical tape to the VTL before the software can recover the data. A mechanism is also required to copy the VTL's catalog from the production site to a VTL at the disaster recovery site so it can recover the data.
The greater concern with permitting the VTL to manage the data copy from disk to tape is that the VTL needs to connect to the backup software to update the backup software's catalog with the information about the new physical tape. Usually, admins manually update the backup software catalog with the information each time the VTL creates a physical tape copy, although some VTLs can handle the chore; for example, Quantum's DXi7500 has an interface to Symantec Corp.'s Veritas NetBackup 6.5 for updates. If the catalog isn't updated, you can only recover data from physical tape by first using the VTL to read the tape or forcing the software to read each tape and then catalog the data on it.
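The catalog problem can be modeled in a few lines. This is a toy sketch, not any vendor's interface: the `BackupCatalog` and `Vtl` classes, the media labels and the `recovery_path` helper are all hypothetical names invented for illustration.

```python
class BackupCatalog:
    """Stand-in for the backup software's catalog: which media hold which backup."""

    def __init__(self):
        self.locations = {}  # backup_id -> set of media labels

    def record(self, backup_id, media_label):
        self.locations.setdefault(backup_id, set()).add(media_label)

    def media_for(self, backup_id):
        return sorted(self.locations.get(backup_id, set()))


class Vtl:
    """Stand-in for a VTL that copies virtual cartridges to physical tape."""

    def __init__(self, catalog=None):
        # catalog=None models a VTL with no interface to the backup software
        self.catalog = catalog
        self.own_catalog = {}  # the VTL always knows what it wrote

    def copy_to_physical_tape(self, backup_id, tape_label):
        self.own_catalog[backup_id] = tape_label
        if self.catalog is not None:
            self.catalog.record(backup_id, tape_label)


def recovery_path(catalog, backup_id, tape_label):
    """Describe how the backup software can get at data on a physical tape."""
    if tape_label in catalog.media_for(backup_id):
        return "restore directly from physical tape"
    return "read the tape through the VTL, or scan and re-catalog the tape first"


synced = BackupCatalog()
Vtl(catalog=synced).copy_to_physical_tape("fri-full", "P0001")
print(recovery_path(synced, "fri-full", "P0001"))

unsynced = BackupCatalog()
Vtl(catalog=None).copy_to_physical_tape("fri-full", "P0001")
print(recovery_path(unsynced, "fri-full", "P0001"))
```

In the second case the tape exists and the VTL knows about it, but the backup software has no record of it, which is the situation a manual catalog update is meant to prevent.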
Backup software
The need to keep the backup software and VTL catalogs in sync for recovery from tape has led many VTL vendors to leave this responsibility to the backup software. Sepaton Inc. considered incorporating a tape management feature in its VTLs, but has abandoned that for now. "We found that most customers want the backup software to manage the creation and copy of physical tape copies in order to maintain backup catalog consistency," says Jay Livens, Sepaton's director of marketing.
Most VTLs appear only as a disk pool or a tape library to backup software, although some, like EMC Corp.'s and Spectra Logic Corp.'s, support backup software within their VTLs. With the EMC DL6000 Series VTLs, admins may configure multiple nodes to host EMC's NetWorker or Symantec's Veritas NetBackup (but not both), which operate as media managers; admins can install backup software directly onto Spectra Logic's nTier.
Jay Krone, EMC's director of Clariion platform marketing, finds that by placing the backup software inside the disk library the software always knows what's going on with the disk and physical tape. EMC took the additional step of more tightly integrating its EMC DL6000 management software with NetWorker and Veritas NetBackup so admins can manage physical tape creation and movement through the backup software or the DL6000's native VTL management interface. "In this configuration, no matter which way physical tapes are cloned and ejected, the backup software catalog always knows where they are," says Krone.
Spectra Logic took a different approach to support backup software on its nTier family. Because nTier runs on Windows Storage Server 2003, admins may install any backup software that runs on the Windows OS on an nTier Series disk library. Admins may configure the backup software to back up data directly to the disk cache on the nTier and then copy or move the data to any disk or tape library that's external to the nTier using the installed software.
Here again problems can surface. Because the backup software server must handle data movement from the VTL to the tape software and back again, the backup software needs to insert itself into the data path. Server performance issues can emerge as the amount of backed up data increases or as data is moved from disk to physical tape, which can impact backup and restore windows. Although backup software media servers external to the VTLs can be upgraded to meet those requirements, this task becomes complicated when upgrading backup software media servers that reside within VTLs.
Leaving the management of backed up data on disk entirely up to backup software doesn't completely alleviate other problems that may arise after using disk over time. Dave Kenyon, Sun's VP of storage marketing, finds that backup software does a "lousy job" of managing disk in VTLs because it provides no method to defragment disks or control access to data on a VTL. But he recognizes that using disk as a primary means for recovery with backup software is becoming a prerequisite for firms. "Companies are really screwing themselves if they use tape as their primary means of recovery," he says.
Virtual tape director
Fujitsu Siemens Computers' CentricStor and Gresham Enterprise Storage Solutions' Clareti VTL are virtual tape directors, a subset of the broader class of VTLs. Like other VTLs, they virtualize and present disk as virtual tape cartridges, but they also virtualize external physical tape libraries and even other VTLs. Residing in the backup data path, virtual tape directors aggregate a firm's physical and virtual tape resources to present a single backup target or mount point to the backup software (see "Consolidating VTLs," below).
During backups, a virtual tape director behaves like a VTL, storing data to its local disk cache. Once cached, however, the data is copied to the appropriate VTL or tape library based on policies set in the backup software. Because the virtual tape director appears as a physical tape library to the software, it can respond to tape library commands issued by the backup software and copy data from disk to tape. This allows the backup software to offload the performance overhead associated with the data movement to the virtual tape director while keeping the backup software's catalog up to date with the creation of physical tape.
Because they virtualize physical tape libraries, virtual tape directors such as the Gresham Clareti VTL may also integrate with physical tape libraries and facilitate faster data recoveries. When the backup software requests data from the Clareti VTL, the Clareti VTL pulls the data directly from disk if the data still resides in its disk cache. If the requested data is no longer on disk, the Clareti VTL's integration with tape libraries allows it to recall data from tape faster than the backup software could.
When backup software requests data directly from a physical tape library, the software sends the library the information about where the data is positioned on the tape. Because most backup software doesn't know how the tape media in the cartridge is physically positioned, the tape drive must rewind the tape to the beginning of the cartridge before it can start to look for the data. However, because the Clareti VTL maintains information about the tape's position in the tape cartridge in its own catalog, it can immediately go to the position on the tape where the data is located without rewinding it.
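The recovery path described here (disk cache first, then a tape position looked up in the director's own catalog) can be sketched as a toy model. Everything in it is an assumption made for illustration: the `VirtualTapeDirector` class, its method names and the block-numbered tapes are hypothetical and don't reflect the Clareti VTL's actual design.

```python
class VirtualTapeDirector:
    """Toy model: a disk cache in front of tape, with a per-backup position index."""

    def __init__(self):
        self.disk_cache = {}   # backup_id -> data still held on disk
        self.tapes = {}        # tape_label -> list of data blocks, in write order
        self.tape_index = {}   # backup_id -> (tape_label, block_number)

    def backup(self, backup_id, data):
        self.disk_cache[backup_id] = data

    def copy_to_tape(self, backup_id, tape_label):
        blocks = self.tapes.setdefault(tape_label, [])
        blocks.append(self.disk_cache[backup_id])
        # Remember where on the tape this backup landed
        self.tape_index[backup_id] = (tape_label, len(blocks) - 1)

    def expire_from_disk(self, backup_id):
        self.disk_cache.pop(backup_id, None)

    def restore(self, backup_id):
        # Serve from disk when the data is still cached
        if backup_id in self.disk_cache:
            return self.disk_cache[backup_id], "disk cache"
        # Otherwise go straight to the recorded tape position
        tape_label, block = self.tape_index[backup_id]
        return self.tapes[tape_label][block], f"{tape_label} block {block}"


vtd = VirtualTapeDirector()
vtd.backup("mon", b"monday data")
vtd.backup("tue", b"tuesday data")
vtd.copy_to_tape("mon", "P0001")
vtd.copy_to_tape("tue", "P0001")

print(vtd.restore("tue"))      # still cached, so served from disk
vtd.expire_from_disk("tue")
print(vtd.restore("tue"))      # found on tape through the position index
```

The point of the index is that the second restore doesn't have to search the tape from the start; the director already knows which block to go to.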
Dealing with deduplication
New VTL features dramatically increase the amount of data that can be stored on disk, but they add to the complexity of copying data from disk to tape. The compression algorithm in the VTL may not be the same as the one used by the target tape drive. This forces an admin to do one of three things when copying data to tape:
Deduplication on VTLs creates similar issues. Because tape drives don't natively support deduplication, a VTL with deduplicated data must first reconstruct the data in its native format before sending it to tape. This requires reserving sufficient time and ensuring that the VTL's performance is sufficient to reconstruct the deduplicated data before copying it off to tape. Technically, the deduplicated data can be copied to tape, but that reintroduces the dependency on the VTL for recoveries.
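To see why reconstruction is needed, consider a minimal sketch of a deduplicating store. It assumes tiny fixed-size chunks and SHA-256 fingerprints purely for illustration; the `DedupStore` class and its methods are hypothetical, and real products use their own chunking and indexing schemes.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, to keep the example readable


class DedupStore:
    def __init__(self):
        self.chunks = {}    # fingerprint -> chunk bytes, stored once
        self.recipes = {}   # backup_id -> ordered list of fingerprints

    def ingest(self, backup_id, data):
        """Split a backup into chunks and keep only chunks not already stored."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.recipes[backup_id] = recipe

    def rehydrate(self, backup_id):
        """Rebuild the backup in its native format, as a tape copy would need."""
        return b"".join(self.chunks[fp] for fp in self.recipes[backup_id])


store = DedupStore()
store.ingest("monday", b"AAAABBBBAAAACCCC")
store.ingest("tuesday", b"AAAABBBBAAAADDDD")

stored_bytes = sum(len(c) for c in store.chunks.values())
tape_image = store.rehydrate("tuesday")

print("bytes backed up:", 32, "bytes stored:", stored_bytes)
print("tape image:", tape_image)
```

Tuesday's backup exists on disk only as a list of fingerprints pointing at shared chunks. Writing it to tape in native format means reading every chunk back and reassembling it, which is the time and performance cost described above.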
Some vendors are providing workarounds to these problems. The simplest method might be found in Copan Systems Inc.'s Revolution 300T/TX, which stores the most recent backup in native backup format with no compression or deduplication. While the Revolution 300T/TX supports compression and deduplication, it performs these functions only after the backup is complete, at a time scheduled by the admin. This avoids the need to reconstruct the data in its native format when copying it to tape, although firms will need sufficient storage on the Revolution 300T/TX to keep an entire backup of all of their data in native format.
Most firms aren't encountering problems with encryption when copying data from disk to tape because encryption is primarily used just prior to moving data offsite. In that scenario, either the backup software or the tape drive encrypts the data just as it's stored to tape. While most VTLs offer encryption as an option, "A practical use case for encrypting data on a widespread basis in the VTL has not yet been made," says EMC's Krone.
As disk assumes a larger role in backup, tape remains a part of most data protection operations. While some VTL vendors have taken measures to incorporate tape management into their products, in the near term, give priority to products that integrate backup software with their VTLs, such as EMC's DL6000 for the large enterprise, and Quantum's DXi7500 and Spectra Logic's nTier for SMBs. But the recent emergence of virtual tape directors like Gresham Enterprise's Clareti VTL and Fujitsu's CentricStor offers a compelling alternative: virtual tape directors let firms introduce disk into their backups, use their existing physical tape libraries and let their backup software manage it all.