This article can also be found in the Premium Editorial Download "Storage magazine: Expanding SANs: How to scale today's storage networks."
One of the major benefits commonly touted for implementing a storage network is improved utilization. With proper planning and cooperation, you can forecast storage needs for a given budget cycle and consolidate storage purchases, resulting in considerable savings. However, most managers leave the backup environment, particularly the tape infrastructure, out of that planning. When backup capacity then runs short, the result is a flurry of panicked activity and, often, a hurried purchase to address the immediate need. Repeat this cycle enough times and you end up with a chaotic backup infrastructure.
Forecasting your tape drive needs
Once you model your current tape environment and measure your actual data growth needs, you can forecast when your current environment will run out of steam. In this real example, the environment was basically already maxed out. By consolidating backup servers on a common platform, the tape infrastructure could be better used, resulting in extended life before it hit the capacity ceiling.
To develop a model for capacity planning, there are two dimensions of backup capacity that need to be examined and understood--one related to performing daily backups and the other pertaining to retention and recall of data. The former is driven by the requirement that sufficient resources are available to meet daily demand. Think of that as a bandwidth consideration. The latter--relating to retention and recall--focuses on the availability of backup data for timely recovery, and is a policy and media capacity consideration. Let's look at the components of each of these in a traditional LAN-based tape-centric backup environment.
Bandwidth
Generally, bandwidth capacity for backups pertains to how much data must be moved within the backup period, and is based on a number of components along the data path. If we look at a traditional network-based backup, the bandwidth capacity is set by the slowest of the following:
- The primary disk from which data is read
- The backup client placing data on the network
- The network transporting backup data to the server
- The backup server reading, processing and writing data to the tape storage system
- The aggregate tape write performance of the tape storage system
To establish the capacity of the current environment, determine the following:
- Network bandwidth = number of backup channels x network throughput
- Server bandwidth = number of media servers x I/O throughput capacity
- Tape drive bandwidth = number of tape drives x tape write rate per drive
- Backup window (in hours)
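The quantities above can be combined into a rough capacity figure. The following is a minimal sketch, assuming every throughput is expressed in GB per hour and that the slowest of the three listed stages governs the whole data path. The function name, parameter names and sample numbers are illustrative assumptions, not figures from the article.

```python
def daily_backup_capacity(channels, network_gb_per_hour,
                          media_servers, server_gb_per_hour,
                          tape_drives, tape_gb_per_hour,
                          window_hours):
    """Return (bottleneck_stage, capacity_gb) for one backup window."""
    # Aggregate bandwidth of each stage, following the three formulas above.
    stage_bandwidth = {
        "network": channels * network_gb_per_hour,
        "server": media_servers * server_gb_per_hour,
        "tape": tape_drives * tape_gb_per_hour,
    }
    # The slowest stage limits how much data can move per hour.
    bottleneck = min(stage_bandwidth, key=stage_bandwidth.get)
    return bottleneck, stage_bandwidth[bottleneck] * window_hours


# Illustrative numbers only (hypothetical environment).
stage, capacity_gb = daily_backup_capacity(
    channels=2, network_gb_per_hour=180,
    media_servers=2, server_gb_per_hour=200,
    tape_drives=4, tape_gb_per_hour=54,
    window_hours=8,
)
print(stage, capacity_gb)
```

With these sample inputs the tape drives are the limiting stage. The sketch leaves out the primary disk and backup client, which would need their own measured throughput if they turn out to be slower.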
Next, determine the current utilization rate. This entails measuring the current daily backup volume, both average and peak, and determining the percent of capacity in use. In many environments, a much higher volume of data is backed up on weekends or other periods of high activity. Be sure to factor this into your calculations.
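As a rough illustration of the utilization step, the sketch below takes a list of measured daily backup volumes and the capacity of one backup window, and reports average and peak utilization as fractions of that capacity. The function name and the sample volumes are hypothetical; real figures would come from your backup application's reports.

```python
def backup_utilization(daily_volumes_gb, capacity_gb):
    """Return (average_utilization, peak_utilization) as fractions of capacity."""
    if not daily_volumes_gb:
        raise ValueError("need at least one daily measurement")
    average = sum(daily_volumes_gb) / len(daily_volumes_gb)
    peak = max(daily_volumes_gb)  # e.g. a weekend full backup
    return average / capacity_gb, peak / capacity_gb


# Hypothetical week: five weekday incrementals and two larger weekend runs.
week_gb = [900, 950, 1000, 920, 980, 1600, 1550]
avg_util, peak_util = backup_utilization(week_gb, capacity_gb=2000)
print(f"average {avg_util:.0%}, peak {peak_util:.0%}")
```

Tracking the peak separately matters because an environment that looks comfortable on average can already be at its ceiling on the heaviest night.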
Once this baseline is established, the next step is to estimate the rate of growth. This should include the data requirements of current applications as well as those planned for the future. Also, be sure to include other factors such as consolidation of multiple backup environments. Obviously, the accuracy of this forecast is critical to the validity of the capacity planning exercise. So I strongly recommend that these numbers be measured against actual data periodically and the forecast adjusted accordingly (see "Forecasting your tape drive needs").
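One way to turn a growth estimate into a forecast is sketched below. It assumes compound monthly growth of the peak daily backup volume, which is a simplifying assumption rather than a method stated in the article; the function name, the sample growth rate and the 120-month horizon are likewise hypothetical.

```python
def months_until_exhausted(peak_gb, capacity_gb, monthly_growth, horizon=120):
    """Months until the peak daily volume reaches capacity, or None if it
    does not happen within the horizon (for example, with zero growth)."""
    volume = peak_gb
    for month in range(horizon + 1):
        if volume >= capacity_gb:
            return month
        volume *= 1 + monthly_growth  # compound growth assumption
    return None


# Hypothetical inputs: 1,000 GB peak today, 2,000 GB capacity, 5% growth a month.
print(months_until_exhausted(1000, 2000, 0.05))
```

A result of 0 means the environment is already at its ceiling. Rerunning the calculation with freshly measured volumes each quarter is a simple way to check the forecast against actual data.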
This was first published in November 2003