This article can also be found in the Premium Editorial Download "Storage magazine: A look inside Hitachi's TagmaStor high-end arrays."
|Checklist: planning for disk backup|
Integrating disk into backup
There are several approaches to implementing disk-based backup (see "The disk-based backup landscape"). Selecting the right approach for your environment is a matter of determining the right balance of functionality, complexity and cost (see "Checklist: planning for disk backup").
Although a tiered data protection and archive storage strategy offers many benefits, getting there requires serious planning in the following areas:
Introduction of additional complexity. In most of today's backup environments, restores are simply a matter of locating the tape with the required data. As long as the backup was successfully completed, the data can be located on tape. Depending on the disk-based approach used, the data may be on disk for minutes, hours, days, weeks or even months. This introduces the need to ensure that data is managed and migrated properly. Some backup products have been enhanced to help perform this function, but in practice this represents a new set of daily operational and management tasks that must be performed and monitored. Imagine discovering at the last moment that a nightly backup to a disk cache never migrated properly to tape, and the next night's backup cycle was about to begin! Using disk also requires more disciplined planning--especially if the disk is shared among multiple media servers.
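The migration check described above can be sketched in a few lines. This is a minimal illustration, not a feature of any particular backup product: the `StagedBackup` record, its field names and the one-hour safety margin are all hypothetical, and a real environment would pull this information from the backup application's own catalog or reports.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class StagedBackup:
    """One backup image in a disk staging area (hypothetical record layout)."""
    name: str
    written_to_disk: datetime
    migrated_to_tape: Optional[datetime] = None  # None: no tape copy exists yet


def unmigrated_before_next_cycle(
    backups: List[StagedBackup],
    next_cycle_start: datetime,
    safety_margin: timedelta = timedelta(hours=1),  # assumed margin, tune to taste
) -> List[StagedBackup]:
    """Return images with no tape copy, or whose tape copy finished after the cutoff.

    The cutoff is the start of the next backup cycle minus the safety margin.
    """
    cutoff = next_cycle_start - safety_margin
    return [
        b for b in backups
        if b.migrated_to_tape is None or b.migrated_to_tape > cutoff
    ]


# Example with made-up data: one image reached tape, one never left disk.
next_cycle = datetime(2004, 9, 2, 22, 0)
staged = [
    StagedBackup("mail-full", datetime(2004, 9, 1, 23, 0), datetime(2004, 9, 2, 6, 0)),
    StagedBackup("db-incr", datetime(2004, 9, 1, 23, 30)),
]
late = unmigrated_before_next_cycle(staged, next_cycle)
print([b.name for b in late])  # ['db-incr']
```

Running a check like this well before the next backup window opens leaves time to rerun the migration instead of discovering the problem at the last moment.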
How your backup software handles disk will greatly affect your disk-based backup implementation. Backup vendors have optimized their applications to utilize tape effectively and many base their licensing fees on criteria such as robotic tape libraries and the number of tape drives (see "Does your backup software do disk?"). Disk-based backup disrupts that model to some degree. Key considerations include:
- Additional software licenses or add-on components required to support disk
- Restrictions or limitations imposed by the application on the use of disk
- Whether the application manages disk space automatically; the type of increments it writes to disk; and its capabilities for extending capacity when file limits are reached and for recovering and reallocating space as data expires
- Additional architectural and procedural changes required to fully realize the benefits of disk
Optimization. It's rare to find a data center that's 100% satisfied with its backup/restore environment. A new disk-based backup environment that emerges from a poorly functioning tape-based backup environment is likely to inherit those same inefficiencies. From a performance standpoint, think of a car that can't exceed 60 mph when the speed limit is 70. Putting that same car on the German Autobahn with no speed limits won't make the car go faster. Consider the processing overhead of doing incremental backups of a system that has millions of small files. The time for the backup software to scan those online files and determine what has changed will remain constant, regardless of whether the backup is going to disk or tape. The first step toward better performance is to find the bottlenecks in your backup environment and ensure that the backup architecture can support driving faster disk or tape.
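A rough back-of-the-envelope model makes the bottleneck argument concrete. The sketch below assumes a deliberately simplified picture: the file scan and the data transfer run one after the other, and the transfer moves at the speed of the slowest stage in the path. The function name, stage names and all throughput figures are illustrative assumptions, not measurements.

```python
from typing import Dict


def backup_hours(
    data_gb: float,
    file_count: int,
    scan_files_per_sec: float,
    stage_mb_per_sec: Dict[str, float],
) -> float:
    """Estimate elapsed hours: file scan time plus transfer at the slowest stage."""
    scan_sec = file_count / scan_files_per_sec
    bottleneck = min(stage_mb_per_sec.values())  # slowest stage sets the pace
    transfer_sec = data_gb * 1024 / bottleneck
    return (scan_sec + transfer_sec) / 3600


# Illustrative incremental backup: 5 million small files, 20 GB of changed data.
common = dict(data_gb=20, file_count=5_000_000, scan_files_per_sec=500)

to_tape = backup_hours(
    stage_mb_per_sec={"client": 15, "network": 40, "target": 30}, **common
)
to_disk = backup_hours(
    stage_mb_per_sec={"client": 15, "network": 40, "target": 200}, **common
)

# Both estimates are identical: the client, not the target, is the slowest stage,
# and the scan time does not depend on the target at all.
print(round(to_tape, 2), round(to_disk, 2))
```

In this made-up case, swapping a 30 MB/sec tape target for a 200 MB/sec disk target changes nothing, which is the point of the Autobahn analogy: the faster target only pays off once the client and the scan stop being the limit.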
Scheduling and policy. After adding disk to your backup environment, you may need to make some modifications to your current backup schedule. If performance gains are notable and the backup window becomes less of a concern, you have an opportunity to modify the current policy to potentially do full backups more often, or perform differential backups instead of incrementals.
Conversely, you may determine that better resource utilization and backup performance can be attained by doing fewer full backups and more incrementals. However, this would lengthen recovery time from cloned off-site tapes, because a restore must apply the last full backup plus every incremental taken since. To avoid this, you may want to perform synthetic fulls, in which the backup software automatically (on a predefined schedule) generates a new full backup tape from a series of disk-based full (level zero) and incremental (level one) backups.
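The idea behind a synthetic full can be shown with a toy model. The sketch below assumes each backup is simply a mapping from file path to a version label, and that an incremental marks a deleted file with `None`. Real backup products work from their catalogs and on-media formats, so the data model and the `synthesize_full` name here are purely hypothetical.

```python
from typing import Dict, List, Optional

# An incremental maps file path -> version label; None marks a deleted file.
Incremental = Dict[str, Optional[str]]


def synthesize_full(
    full: Dict[str, str], incrementals: List[Incremental]
) -> Dict[str, str]:
    """Merge a full backup with later incrementals into a new full image.

    Incrementals must be supplied oldest first, so newer versions win.
    """
    result = dict(full)  # copy, so the original full backup is left untouched
    for incr in incrementals:
        for path, version in incr.items():
            if version is None:
                result.pop(path, None)  # file was deleted after the last backup
            else:
                result[path] = version  # new or changed file
    return result


# Example with made-up data: Sunday full, then Monday and Tuesday incrementals.
sunday_full = {"/etc/hosts": "v1", "/home/a.txt": "v1", "/home/b.txt": "v1"}
monday = {"/home/a.txt": "v2"}
tuesday = {"/home/b.txt": None, "/home/c.txt": "v1"}

synthetic = synthesize_full(sunday_full, [monday, tuesday])
print(synthetic)
# {'/etc/hosts': 'v1', '/home/a.txt': 'v2', '/home/c.txt': 'v1'}
```

The merge reads only from backups already held on disk, so the production systems are not touched again, and the resulting image restores in a single pass.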
|The disk-based backup landscape|
When integrating disk-based backup, the critical decision is what approach--or combination of approaches--is appropriate for a given environment. Here's a list of the strengths and weaknesses of the various techniques.
This was first published in September 2004