Virtual tape libraries can speed backups and restores, but VTL products vary widely and the differences are sometimes subtle.
Initially developed for mainframes, virtual tape systems, which emulate tape libraries to a backup server, are becoming an important part of the shift toward disk-based data protection. Virtual tape libraries (VTLs) present disk as tape, so the normal backup media server can perform backups as usual, regardless of the physical backup infrastructure.
There are two major types of virtual tape architectures: bundled appliances that include virtualization software and disk storage, and virtual tape software that runs on a server platform with a variety of back-end storage devices. An appliance provides higher levels of integration and reduces the operational complexity of deployment and administration. A software-only approach offers more flexibility and investment protection because it enables storage administrators to leverage their existing server and storage infrastructure to deploy virtual tape. In general, an appliance may be more appropriate for small- to medium-sized businesses, while a software-only approach fits best with the operational practices of enterprise-sized organizations.
Examples of appliances include Pathlight VX from Advanced Digital Information Corp. (ADIC), Revolution 200T from Copan Systems, Clariion Disk Library from EMC Corp., DX100 from Quantum Corp., Sepaton S2100-ES from Sepaton Inc. and the Spectra RXT from Spectra Logic. Software products include Securitus from Alacritus Software Inc., VTF Open from Diligent Technologies Corp., VirtualTape Library from FalconStor Software and Virtual Storage Engine (VSE2) from Neartek.
A virtual tape solution manages two sets of data elements: source data and control data. Source data refers to the data being backed up, while control data consists of the mapping between virtual and physical resources and policy parameters. The logical or physical separation of data and control paths improves performance and scalability. Storage managers should look for products that can scale data paths independently of control paths.
Interoperability with your hardware and software is one of the most important selling points of virtual tape products. Virtual tape products differ significantly in the breadth of interoperability support they deliver. Interoperability must be measured along several dimensions. The first and most obvious dimension refers to the number of tape formats and libraries a VTL product supports. To minimize the disruption associated with introducing disk-based technologies into the environment, users should look for virtual tape products that can present disk resources identical to their existing tape libraries and tape formats to the backup hosts; the broader the VTL device's support, the greater the operational efficiencies and investment protection. Storage managers should seek out a VTL product that emulates all of their existing devices, as well as any new platforms being considered for acquisition in the future.
Most enterprises have many heterogeneous hosts consisting of a variety of open-systems flavors and, often, a mixture of open systems and mainframe. Hosts will vary in their connectivity, with mainframes supporting ESCON and FICON, and open systems relying on a combination of Fibre Channel and SCSI. Make sure the VTL supports a variety of heterogeneous hosts and connectivity options, and find out if the VTL will support next-generation SATA disk platforms. The VTL should also support heterogeneous platforms concurrently. Some vendors claim to support heterogeneous platforms, but require the separation of the disparate hosts onto different virtual tape units. This leads to independent data-protection silos that limit the economies of scale and operational leverage in the storage environment. Neartek supports a combination of mainframe and open-systems hosts within a single product. Diligent provides mainframe and open-systems support, but with distinct products. All other VTL vendors are focused on the open-systems market.
Media lifecycle automation
Along with differences in product architecture and interoperability support, virtual tape products vary significantly in their media lifecycle automation capabilities, particularly in the way they handle offsite media management and vaulting. Vendors have developed a number of alternative approaches to managing tape as part of a virtual tape solution. Understanding this aspect of a particular virtual tape product is absolutely critical.
Integrated VTL. In this approach, the VTL sits in front of the physical tape library. This is a completely back-end-driven method, with the virtual tape solution invoking a copy-and-export function that makes a replica of the virtual tape on the physical tape media. This is the simplest approach to tape management, but it can also be the most problematic. Its advantage is that it keeps backup data from traversing the primary network. Its drawback is that bar-code consistency between the physical and virtual tape media isn't always enforced: different bar codes may be used so the backup application catalog isn't confused by two cartridges--one physical and one virtual--with the same bar code. Because the process is completely back-end driven, the physical tape copy and offsite vaulting then aren't reflected in the backup application's media server catalog.
As a result, if a user needs to recover data from a particular volume on the physical tape, the backup app will be completely unaware of the tape's bar code or offsite location. The data would be nearly unrecoverable. Alacritus and FalconStor are two vendors that don't always enforce bar-code consistency.
Copy/export with bar-code consistency. Some virtual tape products preserve the integrity of the media catalog and the ability to recover data from offsite media. The virtual tape system creates a copy, or export, of the virtual tape cartridge analogous to the approach described earlier, but with one major distinction: The virtual tape control software maintains a unique 1:1 relationship between the bar codes of the virtual and physical tape cartridges. The virtual tape cartridge will always have the same bar code as the physical tape cartridge, and if the physical media is taken offsite, the user can specify the offsite location.
However, a potential problem with this approach is that it puts the "same" cartridge in two different physical locations, which can confuse the backup application. The virtual tape solution must therefore keep track of the tape and a disk version of it (if it exists), and preserve consistency between the two, thereby ensuring that the backup application isn't confused.
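To make the bookkeeping concrete, here is a minimal sketch of a catalog that keeps one record per bar code and tracks whether the cartridge currently exists on disk, on physical tape or both. It isn't any vendor's implementation; the class names, method names and the "offsite-vault" location are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CartridgeRecord:
    """One record per bar code, shared by the virtual and physical copies."""
    barcode: str
    on_disk: bool = True                     # virtual (disk) copy exists
    physical_location: Optional[str] = None  # None until copied to tape


class MediaCatalog:
    """Hypothetical control-data catalog enforcing a 1:1 bar-code mapping."""

    def __init__(self) -> None:
        self._records: Dict[str, CartridgeRecord] = {}

    def create_virtual(self, barcode: str) -> CartridgeRecord:
        # Refuse duplicates so a bar code always names exactly one cartridge.
        if barcode in self._records:
            raise ValueError(f"bar code {barcode} already in use")
        record = CartridgeRecord(barcode)
        self._records[barcode] = record
        return record

    def export_to_tape(self, barcode: str, location: str = "library") -> None:
        # The physical copy reuses the virtual cartridge's bar code.
        self._records[barcode].physical_location = location

    def expire_from_disk(self, barcode: str) -> None:
        record = self._records[barcode]
        if record.physical_location is None:
            raise ValueError("no physical copy exists; refusing to expire")
        record.on_disk = False

    def locate(self, barcode: str) -> str:
        # Prefer the disk copy for restores; otherwise report where the tape is.
        record = self._records[barcode]
        if record.on_disk:
            return "disk"
        return record.physical_location
```

In this sketch the backup application only ever sees one cartridge per bar code, while the catalog decides whether a restore is served from disk or from the tape's recorded location.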
There are two variations of this approach. In the first, data from the virtual tape cartridge is moved to the physical tape cartridge using the tape export function. Although bar-code consistency is maintained, the backup media server may now incorrectly believe that the tape is no longer in the physical library. As a result, the recovery process for data not on disk involves manual tape reloads. Vendors supporting this type of approach include ADIC, Alacritus, Copan, EMC and FalconStor.
In the second variation, data from the virtual tape cartridge is automatically copied to the physical tape. The virtual tape solution tracks the media bar codes that have been copied and remain in the physical library (whereas in the first variation this media would have been believed to be exported). Because these products view the disk-based "virtual" media and the tape-based "physical" media as one media pool, users can recover data from local tape in an automated manner using a combination of the virtual tape and backup apps. The backup application alone may not be able to recover the data because it may not be aware of the media location. Neartek's VSE2 offers this type of implementation. Users looking to keep backup data off the primary network will find merit in both variations.
Standalone. This method relies on the backup media server to move data from virtual to physical tapes--the virtual tape and physical tape are "side by side" and both are addressed by the backup application. The media server mounts the virtual tapes, and then moves the backup data over the network and onto the physical tape drives. This preserves the backup application's control and knowledge of the physical media, eliminating issues that may arise with the export-and-copy approach.
There are a few subtle variations in product capabilities. In some cases, the backup media server will write the backup data set directly to the physical tape library. This is the case for ADIC, Alacritus, Copan, Diligent, EMC, FalconStor, Quantum, Sepaton and Spectra Logic.
In this scenario, the user incurs licensing fees for both sets of virtual and physical tape libraries (as both are written to by the backup application). Some virtual tape architectures, like Neartek's VSE2, allow the backup stream to be directed to the physical tape media via the virtual tape devices. Thus, the backup application sees only the VTLs as its targets, and users avoid the incremental and often substantial licensing expenses of their physical tape devices.
Tape as tape. This is the ability to emulate one tape format with a different underlying physical tape infrastructure, an important capability that enables seamless interoperability in tape formats between the backup application and the physical tape.
For example, with tape-as-tape emulation, a backup application that couldn't natively support a particular tape format could be "fooled" into writing to the previously unsupported format. Vendors supporting tape as tape are Alacritus, Copan, EMC, FalconStor, Neartek and Spectra Logic.
A careful assessment of your offsite media management requirements must be conducted when evaluating any virtual tape solution. It's important to look for products that preserve media catalog integrity and data recoverability from offsite media while also providing operational flexibility.
Introducing a new tier of storage into the data protection infrastructure will create some operational challenges for an IT department. Administrators will have to determine how many days of data should be retained on disk, how frequently data should be copied to tape from disk, and the number of data copies that should be retained on disk and tape. Once acceptable standards are established, the execution of these plans can be very labor-intensive.
For this reason, any virtual tape offering must provide a comprehensive and flexible policy framework that allows administrators to specify parameters (such as those above) on a volume-specific basis. A policy for vol1, for example, may stipulate that all virtual media on which vol1 is stored should be mirrored and copied to physical tape on a daily basis starting at 2 a.m. Such policies ensure that backup operations are carried out in a reliable and repeatable manner.
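As an illustration only, the vol1 policy described above could be modeled along these lines. The field names, the 14-day disk retention and the single tape copy are assumptions made for the sketch, not settings taken from any product.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class VolumePolicy:
    """Hypothetical per-volume policy record."""
    volume: str
    mirror: bool               # mirror the virtual media holding this volume
    tape_copy_time: time       # daily start time for the copy to physical tape
    disk_retention_days: int   # assumed value; days the data stays on disk
    tape_copies: int           # assumed value; copies kept on physical tape


def next_tape_copy(policy: VolumePolicy, now: datetime) -> datetime:
    """Return the next daily start of the disk-to-tape copy after `now`."""
    candidate = datetime.combine(now.date(), policy.tape_copy_time)
    if candidate <= now:
        # Today's slot has already passed, so schedule tomorrow's.
        candidate += timedelta(days=1)
    return candidate


# The article's example: mirror vol1 and copy it to tape daily at 2 a.m.
VOL1_POLICY = VolumePolicy(
    volume="vol1",
    mirror=True,
    tape_copy_time=time(2, 0),
    disk_retention_days=14,
    tape_copies=1,
)
```

Keeping the policy as data rather than as a manual procedure is what makes the nightly copy repeatable: the scheduler only has to ask when the next run is due.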
In addition, the ability to define dynamic storage or media pools enables the use of automated migration policies for data movement between tape and disk tiers. Other important aspects of a policy framework include the ability to prioritize tasks or backup jobs. This gives backup jobs some level of quality of service and ensures that they can be completed within a specified backup window. Finally, distributed enterprise environments require effective security measures. The capability to create multiple volume groups or domains, each with its own authorization parameters, must be included in virtual tape policies.
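Job prioritization can be pictured as a simple priority queue. The sketch below is a generic illustration, assuming that a lower number means a more urgent job and that jobs of equal priority run in submission order; the job names are hypothetical, and real products may schedule quite differently.

```python
import heapq
from itertools import count


class BackupJobQueue:
    """Hypothetical queue that releases the most urgent backup job first."""

    def __init__(self) -> None:
        self._heap = []
        self._sequence = count()  # tie-breaker that preserves submission order

    def submit(self, name: str, priority: int) -> None:
        # Lower priority numbers are served first (an assumption of this sketch).
        heapq.heappush(self._heap, (priority, next(self._sequence), name))

    def next_job(self) -> str:
        # Pop the job with the smallest (priority, sequence) pair.
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

With a queue like this, a job critical to the backup window can be submitted late and still run ahead of lower-priority work.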
Collocation and stacking
The growth in tape cartridge size has created a glut of underutilized tape capacity in IT environments. Backup apps typically write data to tape serially, which leaves significant cartridge capacity unused. For example, Job 1 may be written to Cartridge A and use only 10% of that cartridge's capacity. Job 2 would then be written to Cartridge B, leaving 90% of Cartridge A unused.
By embedding additional layers of intelligence in the tape data-layout process, virtual tape vendors can improve tape utilization. Collocation, or stacking, places unrelated volumes/jobs from a virtual tape drive onto a physical one in a manner that maximizes the utilization of the cartridge. In addition to significant reductions in cartridge count (some users have seen a tenfold reduction, which isn't surprising given that utilization rates tend to be around 15% to 20%), collocation also speeds up restores. With collocation, a volume restore no longer requires the restoration of the entire tape. Products supporting collocation or stacking include ADIC's Pathlight VX, Diligent's VTF Mainframe, EMC's Clariion Disk Library and Neartek's VSE2.
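The effect of stacking on cartridge count can be sketched with a simple first-fit-decreasing packing routine. This is a textbook bin-packing heuristic used here purely for illustration, not the layout algorithm of any product named above; the function name, job names and sizes are hypothetical.

```python
from typing import Dict, List


def stack_jobs(job_sizes_gb: Dict[str, int], cartridge_gb: int) -> List[List[str]]:
    """Pack jobs onto as few cartridges as first-fit decreasing manages.

    Returns one list of job names per physical cartridge.
    """
    cartridges: List[List[str]] = []
    free_gb: List[int] = []  # remaining capacity of each cartridge

    # Place the largest jobs first; ties keep their original order.
    ordered = sorted(job_sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        if size > cartridge_gb:
            raise ValueError(f"{name} does not fit on a single cartridge")
        for index, remaining in enumerate(free_gb):
            if size <= remaining:
                # First cartridge with enough room takes the job.
                cartridges[index].append(name)
                free_gb[index] -= size
                break
        else:
            # No existing cartridge had room, so start a new one.
            cartridges.append([name])
            free_gb.append(cartridge_gb - size)
    return cartridges


# Ten jobs that each fill 10% of a 400 GB cartridge.
jobs = {f"job{i}": 40 for i in range(1, 11)}
stacked = stack_jobs(jobs, cartridge_gb=400)
```

Written serially, one job per cartridge, these ten jobs would occupy ten cartridges; stacked, they fit on one.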
Virtual tape holds tremendous promise. It's a powerful technology that effectively bridges the gap between traditional tape and disk-based data protection architectures as non-disruptively as possible because virtual tape preserves your operational processes. But the wrong virtual tape product can be disastrous for your storage environment, so a thorough evaluation of available products is essential.