This article can also be found in the Premium Editorial Download "Storage magazine: Five companies on their storage virtualization projects."
The different virtual tape library (VTL) architectures offer various ways to fine-tune performance by taking advantage of the unique characteristics of disk. For instance, EMC Corp.'s Clariion Disk Library allows users to increase performance by using its write-cache consolidation feature, which consolidates blocks of data in backup streams into 1MB blocks and then writes the blocks directly to disk. This allows the data to be laid down sequentially rather than randomly; random writes degrade performance on SATA drives.
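To make the consolidation idea concrete, here is a minimal Python sketch of write coalescing. It is not EMC's implementation: the `WriteCoalescer` class, the `coalesce` helper and the in-memory `flushed` list (standing in for sequential disk writes) are all hypothetical names chosen for illustration. Only the 1MB block size comes from the description above.

```python
class WriteCoalescer:
    """Buffers small incoming writes and emits fixed-size blocks.

    Hypothetical illustration only; not any vendor's actual code.
    """

    def __init__(self, block_size=1024 * 1024):
        self.block_size = block_size  # 1MB, per the article
        self.buffer = bytearray()
        self.flushed = []  # stands in for sequential writes to disk

    def write(self, data):
        # Accumulate incoming data, flushing every full block.
        self.buffer.extend(data)
        while len(self.buffer) >= self.block_size:
            self.flushed.append(bytes(self.buffer[:self.block_size]))
            del self.buffer[:self.block_size]

    def close(self):
        # Flush whatever partial block remains at end of stream.
        if self.buffer:
            self.flushed.append(bytes(self.buffer))
            self.buffer.clear()


def coalesce(chunks, block_size=1024 * 1024):
    """Run a sequence of small writes through the coalescer."""
    coalescer = WriteCoalescer(block_size)
    for chunk in chunks:
        coalescer.write(chunk)
    coalescer.close()
    return coalescer.flushed
```

Feeding four 600,000-byte writes through `coalesce` yields two full 1MB blocks plus one partial block, so the disk sees three large sequential writes instead of four smaller ones arriving at arbitrary offsets.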
Sepaton Inc.'s S2100-ES2 employs two different technologies to optimize performance on its VTL. First, Sepaton allows users to group shelves of disk drives into pools that are written to and read by the scalable replication engines (SREs) that host its VTL software. Next, the SREs break up incoming backup jobs into 32MB chunks called extents and sequentially write one extent to each shelf in the pool.
This approach provides two performance benefits. First, by distributing data across all disks in the pool using extents, sequential read-and-write performance is not impacted. The 32MB size of the extents ensures that random reads or writes are spread across enough disk drives on different shelves that performance doesn't suffer. The other performance benefit shows up if the throughput limit of the existing SREs is reached. Because the S2100-ES2 supports up to nine SREs, another SRE can usually be added to the S2100-ES2 (unless it's already fully populated) without the need to introduce a new VTL into the equation.
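The extent layout can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sepaton's code: the function names are hypothetical, and simple round-robin placement is assumed as one plausible reading of "one extent to each shelf in the pool." Only the 32MB extent size comes from the article.

```python
EXTENT_SIZE = 32 * 1024 * 1024  # 32MB extents, per the article


def split_into_extents(job_size, extent_size=EXTENT_SIZE):
    """Return (offset, length) pairs covering a backup job of job_size bytes."""
    return [
        (offset, min(extent_size, job_size - offset))
        for offset in range(0, job_size, extent_size)
    ]


def assign_round_robin(extents, shelf_count):
    """Place extents on shelves in rotation (assumed placement policy)."""
    placement = {shelf: [] for shelf in range(shelf_count)}
    for index, extent in enumerate(extents):
        placement[index % shelf_count].append(extent)
    return placement
```

With a 100MB job and a three-shelf pool, the job splits into three full extents and one 4MB remainder; the first shelf receives the first and fourth extents, and the other two shelves receive one extent each.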
Diligent Technologies Corp.'s VTL server appliance approach gives users the option to place backed-up data on Fibre Channel (FC) disk drives, not just SATA drives. The company finds that, on average, one of its nodes can achieve about 200MB/sec when connected to back-end SATA drives; if that same node uses FC drives on a high-end array, performance can climb to as high as 350MB/sec. In typical deployments, however, Diligent sees performance increases of only 10% to 20% when users back up to FC drives instead of SATA drives.
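To put those figures in perspective, the short Python sketch below converts throughput into a backup window. The 10TB data set is a hypothetical example, not a Diligent figure, and the helper names are invented for illustration; decimal units (1TB = 1,000,000MB) are assumed.

```python
def percent_gain(baseline_mb_per_sec, improved_mb_per_sec):
    """Percentage increase of improved throughput over baseline."""
    return (improved_mb_per_sec - baseline_mb_per_sec) / baseline_mb_per_sec * 100


def backup_hours(dataset_tb, throughput_mb_per_sec):
    """Hours needed to write dataset_tb at a sustained throughput."""
    total_mb = dataset_tb * 1_000_000  # decimal units assumed
    return total_mb / throughput_mb_per_sec / 3600
```

The 350MB/sec peak is a 75% gain over 200MB/sec. For the hypothetical 10TB data set, that would cut the window from roughly 13.9 hours to about 7.9 hours, whereas the more typical 10% to 20% gain (220MB/sec to 240MB/sec) would leave it at about 11.6 to 12.6 hours.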
Once data is stored on a VTL, copying or moving the data from the VTL to other media for long-term archiving or offsite data protection becomes an issue. There are three basic ways to move or replicate data from a VTL:
- Use the VTL to manage the movement of data between disk and tape
- Use backup software to move VTL-based data to tape
- Replicate data to an offsite VTL
This is one of the reasons why it makes sense to use tape library-based architectures. Products such as ADIC's Pathlight VX 450 and VX 650 create real tapes for export, but only under the control and direction of the backup software. In this way, the backup software catalog remains consistent and data transfers between disk and tape occur without introducing SAN traffic or overhead on the backup server to perform the copies from virtual tape to real tape. Vendors of the other VTL architectures generally recommend letting the backup software manage and move the data between disk and tape. However, this approach imposes a performance hit on both the backup server and the SAN, so the copies should be scheduled during periods of low backup activity to minimize the impact.
To eliminate additional overhead on the backup server, EMC introduced a new feature on its Clariion DL700 series that addresses this specific data management problem. An optional storage node that contains a version of EMC's NetWorker backup software handles the movement of virtual tapes to physical tapes while sending updates to NetWorker's master backup software catalog.
Sending updates from the storage node to the master catalog allows the master catalog to maintain its consistency. As the storage node moves data back and forth between virtual and real tape, the node updates the catalog on the master backup server with the location of the tapes. Although the feature is currently available only with NetWorker, EMC plans to offer storage nodes that support other backup software products and to extend this feature to its Clariion DL200 line of VTL products. Other VTL hardware appliance vendors like NetApp and Sepaton also intimated that they plan to announce similar functionality in the near future.
The final option for moving and archiving data offsite is to simply install a second VTL and replicate data between the two, taking tape out of the equation altogether. VTL products from Copan Systems, Dynamic Solutions International, EMC, NetApp and Sun, among others, support this approach: users can asynchronously copy or move data between VTLs at two sites and nearly eliminate the need for real tape. However, this approach doesn't scale easily and requires significant network bandwidth. You should employ this approach only when limited amounts of data need to be moved or copied.
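A rough estimate shows why bandwidth limits VTL-to-VTL replication. The Python sketch below is a back-of-the-envelope calculation only: the function name, the 500GB nightly volume and the 100Mbit/sec link are hypothetical values chosen for illustration, and the `efficiency` parameter is an assumed allowance for protocol overhead.

```python
def replication_hours(data_gb, link_mbit_per_sec, efficiency=1.0):
    """Hours to push data_gb across a WAN link.

    efficiency is the assumed fraction of raw link speed actually usable.
    """
    bits = data_gb * 8_000_000_000  # decimal gigabytes to bits
    usable_bits_per_sec = link_mbit_per_sec * 1_000_000 * efficiency
    return bits / usable_bits_per_sec / 3600
```

Under these assumptions, replicating 500GB over a fully available 100Mbit/sec link takes about 11 hours, and close to 14 hours if only 80% of the link is usable. Larger nightly volumes quickly exceed what can be moved before the next backup cycle begins.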
VTL vendors are implementing a host of features to make their VTLs look and act more like real tape libraries. But only VTLs that deliver the benefits of real tape libraries--infinite capacity and data portability--should be considered enterprise ready. For now, only tape library-based architectures from ADIC, Spectra Logic and EMC's Clariion DL700 line with its storage node option appear to meet those requirements.
This was first published in August 2006