Tape libraries on the SAN: sharing isn't always good



Alternatively, you could allocate some number of drives to each of the three storage nodes depending on their current and projected workloads. In this connection scenario, the storage nodes won't have a defined path to all 12 tape drives. Instead, they'll only be able to see those drives allotted to them via fabric zoning.

The shared pool approach will definitely exercise your storage application's ability to recover from device delays because there's no direct way to throttle any one storage node's ability to request tape drives. You could control the number of tape requests issued by a storage node by massaging the schedule, but that assumes some person or group is actively managing the schedule, which in this scenario isn't the case. Therefore, depending on when a particular storage node is scheduled for work, that node could potentially request and reserve all 12 tape drives for as long as it needs to complete its work. During that time, the two remaining storage nodes would have to wait until the first storage node has completed enough work to release its reservation on a tape drive, possibly causing backup failures.
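The contention problem described above can be sketched in a few lines. This is a toy model, not any backup product's API; the class and node names are invented for illustration:

```python
from collections import deque

class SharedDrivePool:
    """Toy model of a shared pool of tape drives (hypothetical, not a
    vendor API). Any node may reserve as many free drives as it asks
    for; nothing throttles a greedy node, so later requesters wait."""

    def __init__(self, num_drives=12):
        self.free = num_drives
        self.waiting = deque()  # (node, drives still needed)

    def reserve(self, node, count):
        granted = min(count, self.free)
        self.free -= granted
        if granted < count:
            self.waiting.append((node, count - granted))
        return granted

    def release(self, count):
        self.free += count

pool = SharedDrivePool(12)
# Node A is scheduled first and requests drives for 12 parallel streams.
print(pool.reserve("node-a", 12))   # 12 -- the entire pool
# Nodes B and C now get nothing until A releases drives.
print(pool.reserve("node-b", 4))    # 0
print(pool.reserve("node-c", 4))    # 0
```

With no schedule management, whichever node runs first wins the whole pool, which is exactly the failure mode described above.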

The allocation approach calls for the pre-assignment of physical tape drives to each of the three storage nodes in whatever combination meets the expectations of the storage nodes. For example, suppose you had two dedicated backup storage nodes or media servers to drive backup data for 100 application hosts.


In addition, you had one storage node or media server that doubled as a high-availability Oracle server with a significant amount of data. The server dumps logs to tape on an hourly basis.


In this scenario, you might want to allocate five tape drives to each of the dedicated storage nodes, and two to the storage node doubling as an Oracle server. This way, the Oracle server will never have to wait for a tape drive during the busiest times of the other two storage nodes.

That approach requires more emphasis on capacity planning because once you have provisioned your tape drives in the desired configuration, a dedicated storage node that's using all five of its tape drives can't automatically request a sixth, even if there's a tape drive available in another storage node's pool. Of course, you can change your fabric zoning configuration to shift drives from one storage node to another, but not in real time. I've implemented both approaches and managed the results; they work, but only in the right scenarios.
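The hard cap that allocation imposes can be shown with a static table standing in for fabric zoning. Everything here is hypothetical (the node and drive names are invented, and real zoning lives in the switch configuration, not on the host):

```python
# Hedged sketch: a static allocation table standing in for fabric zoning.
# 5 + 5 + 2 drives, matching the example in the text above.
ALLOCATION = {
    "storage-node-1": {"drive%02d" % n for n in range(1, 6)},    # 5 drives
    "storage-node-2": {"drive%02d" % n for n in range(6, 11)},   # 5 drives
    "oracle-node":    {"drive%02d" % n for n in range(11, 13)},  # 2 drives
}

def request_drives(node, needed, in_use):
    """Grant only drives zoned to this node, even if drives in another
    node's pool sit idle -- that's the allocation trade-off."""
    visible = ALLOCATION[node] - in_use
    return set(sorted(visible)[:needed])

busy = set()
granted = request_drives("storage-node-1", 6, busy)  # asks for a sixth drive
print(len(granted))  # 5 -- capped at its allocation, regardless of the pool
```

The node asking for a sixth drive gets five, no matter how idle the rest of the library is; shifting that boundary means a zoning change, not a runtime request.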

If the operations group or help desk is responsible for managing the backup schedule for a modest number of application hosts--say, up to 100--then the best thing to do is use some of the money saved by not having your engineering group support this environment to purchase an additional tape drive or two, and go with the shared pool approach, making your tape drives visible to each of your storage nodes or media servers. Unlike with the allocated approach, ongoing capacity planning won't be necessary. And because the application server count isn't inordinately high, overscheduling backup jobs during the backup window is less likely. That isn't to say the backup schedule deserves no care--you still don't want to direct too many backup streams to any one storage node or media server in any one time frame.

For organizations supporting more than 100 application hosts--especially if they're already having problems completing their backups before the end of the backup window--capacity planning should be a high priority. A high number of application hosts, and a growing amount of utilized storage associated with those hosts, are early indicators that capacity planning is justified for how you connect the tape drives in your tape library to the backup servers on your SAN.

Oftentimes, in the rush to just get it done, administrators stuff hundreds of backup clients into the schedule wherever they fit, assigning them to storage nodes at random without regard for each node's current load or the resource requirements of its neighbors. In an environment where all of the tape drives are visible to all of the backup servers, an uncontrolled mount storm is likely to occur and cause resource shortages in your tape pool.

However, in an environment where the 12 drives are divided in some combination and specifically provisioned to each server, a similar mount storm yields predictable results. Each server has access only to the tape drives it was provisioned. The same number of mount requests will still be generated, because that depends on the number of simultaneous streams each tape drive can support.

But allocating drives yields some predictability, because you know how much data you have to back up and how many tape drives you have to perform those backups. With a shared pool, you know how much data is going to be backed up, but the number of tape drives a storage node has available could change from day to day, depending on what work is scheduled for the other storage nodes at the same time.

If you're thinking the allocated approach appears to go against the benefit of sharing tape drives in a SAN, you're right. Not all installations will be able to take advantage of the real-time tape drive sharing capabilities of their backup and recovery software. Instead, because of the number of backup clients and the volume of associated data backed up daily, these installations will benefit from the speed, scalability and management enhancements of a SAN-attached tape library. Certainly, there are monetary reasons to share tape drives, but if you're supporting hundreds of backup clients, with individual business units having access to the backup schedule and able to add and remove backup clients from the environment, then you must institute predictability somewhere in your plan.

Robotic arm connection
The robotic arm is controlled and accessed by the master server in your backup or hierarchical storage management (HSM) environment. The storage nodes or data movers make mount requests to the master server and the master server then issues SCSI commands to the library's robotic arm to select and load the requested tape. How you make the robotic arm visible on the SAN depends on the logistics of your data center(s) and your business continuance objectives.
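The master-server/robotic-arm flow described above can be sketched as a simple request queue. The class and method names here are illustrative, not any backup product's API, and the SCSI command is simulated:

```python
import queue

class MasterServer:
    """Hypothetical sketch: the master server owns the robotic arm.
    Storage nodes submit mount requests; the master server services
    them by issuing the (simulated) SCSI command to the robot."""

    def __init__(self):
        self.requests = queue.Queue()  # mount requests from storage nodes

    def submit_mount(self, node, tape, drive):
        """Called by a storage node / data mover needing a tape loaded."""
        self.requests.put((node, tape, drive))

    def service_one(self):
        """Pop one request and 'drive' the robot. In reality this would
        be a SCSI MOVE MEDIUM to the library, followed by a status wait."""
        node, tape, drive = self.requests.get()
        return "robot: moved %s to %s for %s" % (tape, drive, node)

master = MasterServer()
master.submit_mount("storage-node-1", "TAPE042", "drive03")
print(master.service_one())
```

The point of the sketch is the indirection: storage nodes never talk to the robot themselves, so the path between the master server and the arm is the one that matters for the design decisions that follow.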

If your SAN design allows your tape library to be located within the SCSI distance limitations of your master backup server, then there's no real benefit to bridging the robotic arm into the SAN with an FC/SCSI bridge. In fact, there could be a downside to making the robotic arm visible to the master server via the SAN instead of directly attaching it via a SCSI cable. This is especially true if the master server also doubles as a storage node and is responsible for moving large amounts of data. In this scenario, make sure the data stream resulting from scheduled backups isn't impeding the command, data and status information units directed to and from the robotic arm. The risk arises because the hardware interrupt management routines involved in streaming data to the tape drives can delay the status of a SCSI mount command on its way back to the initiator (the master server).

However, if your design and business recovery objectives require that the robotic arm be accessed over a distance, consider installing a second FC host bus adapter (HBA) in the master server--possibly on a separate bus--to access the robotic arm. With 1Gb/s HBAs selling on eBay for $200 and prices per switch port declining, this configuration isn't as much of a luxury expense as it used to be.

The ability to access the robotic arm over FC does have its benefits, but those benefits are related to distance with regard to disaster recovery or building logistics. To that end, be sure to certify and maintain firmware revisions in your interconnect devices, as well as any device drivers and SCSI tape patches on the master server.

If you're anticipating deploying an enterprise-class tape library onto the SAN, ask lots of questions. Make sure your vendors explain in detail the pros and cons of the design methods used to integrate the hardware into the SAN, emphasizing the cons. Oftentimes, we focus on the benefits of implementing new hardware without looking at the other side of the coin. Having an independent integrator or a tenacious employee on your side can help protect your interests by adding a different perspective.

Web Bonus:
Online resources from SearchStorage.com: "Quick Takes: Forever Tape," by Kevin Komiega.

This was first published in December 2002
