This article can also be found in the Premium Editorial Download "Storage magazine: Are your data storage costs too high?"
If you've been evaluating enterprise-class tape libraries, you've no doubt been inundated with product glossies from vendors touting the many options for connecting their particular library to your storage area network (SAN). Some of those options amount to different strokes for different folks, but you can untangle the various connection options for your SAN-enabled tape library and make an informed decision.
Data center logistics
The first requirement you need to understand is distance. Where will your tape library be in relation to the backup servers and the hosts it will back up?
Because most enterprise-class libraries start large and can grow larger, invest some time determining the optimal location for the incoming library. Account not only for its initial dimensions, but also for the time frame in which your overall data growth would require you to add an extension cabinet to your initial tape library investment (see "
|Vendors providing native Fibre Channel (FC) tape drives often provide both A and B ports for connectivity into separate storage area networks (SANs) for redundancy. Although this functionality exists, the software needed to exploit it isn't readily available in the field. Perhaps in the near future, as tape drive speeds increase--making them more viable solutions for high-profile hierarchical storage management applications--vendors in the volume management space will enhance their applications to allow a physical tape drive to be referenced over an alternate path once the primary path has failed.|
Locating an expandable enterprise-class tape library in a data center that's already congested with equipment could require you to relocate the entire backup and recovery environment or move production application servers to accommodate expansion cabinets. Imagine walking into an application owner's office and suggesting they bring down their production servers and storage and move them to a smaller section of the data center so you can move your behemoth tape library. Not pretty.
Before Fibre Channel (FC), that wasn't likely to happen because smaller libraries had to be close to backup servers due to SCSI's distance limitations. We were likely to look at the backup and recovery hardware as a complete environment, instead of as single pieces of equipment to be managed separately. However, since the allowable distances inherent in FC's architecture are greater than SCSI's, IT organizations have been tempted to locate their tape libraries away from their backup servers simply because they could. It was easier to just plop the library down in the next available section of raised floor.
So if your design calls for the tape library and backup servers to be in the same data center, try to locate your tape library adjacent to your backup servers, with enough space allotted to accommodate the total projected cabinet expansion. That will save you both the downtime and the support costs often associated with moving equipment.
However, if a SAN design requires that the backup servers be located in data center A along with the application hosts they're protecting, and the tape library is located in data center B for maximum protection, you'll want to ensure that the allotted space in data center B will support additional cabinets as well.
This second scenario is one that many IT groups with stringent business continuance objectives implement for campus environments. The backup data is located in a different building from the application hosts, which protects it from a disaster at the hosts' site. By placing the backup server in the same location as the application hosts, the hosts are able to deliver their data to the backup server and ultimately to the FC-attached tape library faster than if the backup server were located a few more IP hops away at data center B.
Tape drive connection
The criticality of your production applications drives some decisions on connecting the tape library to the SAN, as well as calculating how many servers will be necessary to drive data to and from tape. For example, if you're backing up infrequently updated data, then it may not be the end of the world if the storage node--to use Legato terminology--responsible for driving the data to the library fails during the backup window.
At the other end of the spectrum, you may be supporting thousands of updates a minute. In that case, the failure of a storage node (or media server in Veritas speak, data mover in EMC lingo) in conjunction with data loss on the associated application host would be more costly. Having more than one storage node provides the application host with more protection, and most backup and recovery software vendors support the ability to define more than one storage node in case of a failure.
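As a rough illustration of what defining more than one storage node buys you, the sketch below picks the first healthy node from an ordered preference list. It is a toy model under stated assumptions, not how any particular backup product implements failover: the function name `pick_storage_node`, the node names and the health map are all hypothetical.

```python
from typing import Dict, List, Optional


def pick_storage_node(preferred: List[str], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the first node in `preferred` that is marked healthy.

    `preferred` is the ordered list of storage nodes defined for a client
    (hypothetical names); `healthy` maps node name -> True/False.
    Nodes missing from `healthy` are treated as unavailable.
    """
    for node in preferred:
        if healthy.get(node, False):
            return node
    return None  # no storage node available; the backup cannot run


# One node defined: a single failure leaves the client unprotected.
single = pick_storage_node(["node-1"], {"node-1": False})

# Two nodes defined: the backup falls through to the second node.
dual = pick_storage_node(["node-1", "node-2"], {"node-1": False, "node-2": True})
```

With a single node defined, one failure during the backup window leaves nothing to fall back on; with two, the job can still reach the library through the surviving node.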
Now that you've decided whether one or multiple storage nodes are required, you can start to answer some of your connectivity questions.
Ask yourself what your organization's attitude has been toward managing the storage application for which the library is intended. Be honest about your staff's attitude and ability. For instance, if there aren't any procedures or practices in place for adding servers to the backup schedule, then good capacity planning is likely nonexistent. The skill and effort displayed by your staff play a role in deciding how to provision the tape drives in your library. Here's why.
There are basically two approaches to mapping available drives to servers: a shared pool or an allocated pool (see "SAN tape libraries," this page). If your tape library includes 12 tape drives that must be available to three storage nodes, you could make all of the tape drives a physically shared pool available to all of the storage nodes, yielding 36 logical instances of the 12 physical drives. That means your master backup server has to manage who gets the physical drives without assigning the same drive to different storage nodes at the same time.
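The shared-pool arithmetic and the arbitration job it creates for the master backup server can be sketched in a few lines. This is a minimal model, assuming a simple first-free-drive policy; the class `SharedDrivePool`, its methods and the node names are hypothetical and do not represent any vendor's API.

```python
from typing import Dict, List, Optional


class SharedDrivePool:
    """Toy arbiter for a physically shared pool of tape drives."""

    def __init__(self, drive_count: int, storage_nodes: List[str]) -> None:
        self.storage_nodes = list(storage_nodes)
        # Physical drive number -> node currently holding it (None = free).
        self.owner: Dict[int, Optional[str]] = {d: None for d in range(drive_count)}

    def logical_instances(self) -> int:
        # Every node sees every physical drive: drives x nodes.
        return len(self.owner) * len(self.storage_nodes)

    def acquire(self, node: str) -> Optional[int]:
        """Give `node` the lowest-numbered free drive, or None if all are busy."""
        if node not in self.storage_nodes:
            raise ValueError(f"unknown storage node: {node}")
        for drive, holder in self.owner.items():
            if holder is None:
                self.owner[drive] = node
                return drive
        return None

    def release(self, node: str, drive: int) -> None:
        """Free `drive`, but only if `node` is the one holding it."""
        if self.owner.get(drive) != node:
            raise ValueError(f"{node} does not hold drive {drive}")
        self.owner[drive] = None


pool = SharedDrivePool(12, ["node-a", "node-b", "node-c"])
print(pool.logical_instances())  # 12 drives x 3 nodes = 36
first = pool.acquire("node-a")   # node-a gets drive 0
second = pool.acquire("node-b")  # node-b gets drive 1, never drive 0
```

The point of the model is the ownership map: although each storage node has a logical path to all 12 drives, the arbiter hands out each physical drive to at most one node at a time.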
This was first published in December 2002