This article can also be found in the Premium Editorial Download "Storage magazine: Five cutting-edge storage technologies."
|Data and backup device mapping matrix|
Backup and restore times can be greatly affected by the number and size of the files you are backing up. Millions of small files pose a serious challenge for traditional tape backup products. When written to tape using typical backup products, a large number of small files can significantly impair the performance of even the fastest drives. It's not uncommon to see backup and restore speeds of 50KB/sec to 100KB/sec on tape drives rated to run at 15MB/sec to 30MB/sec. This dramatically increases backup and restore times and at the same time decreases the lifespan of the tape drives and media.
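To put those throughput figures in perspective, the arithmetic can be sketched in a few lines of Python. The 100GB data set is a hypothetical size chosen only for illustration, and the function name `backup_hours` is likewise an assumption; the throughput values are the ones quoted above.

```python
def backup_hours(data_gb: float, throughput_kb_per_s: float) -> float:
    """Hours needed to move data_gb gigabytes at a sustained throughput.

    Uses binary units (1GB = 1024MB, 1MB = 1024KB).
    """
    if throughput_kb_per_s <= 0:
        raise ValueError("throughput must be positive")
    data_kb = data_gb * 1024 * 1024
    return data_kb / throughput_kb_per_s / 3600


# Hypothetical 100GB data set, timed at the rates quoted in the text.
DATA_GB = 100
rates_kb_per_s = {
    "small files, 50KB/sec": 50,
    "small files, 100KB/sec": 100,
    "rated speed, 15MB/sec": 15 * 1024,
    "rated speed, 30MB/sec": 30 * 1024,
}

for label, rate in rates_kb_per_s.items():
    print(f"{label}: {backup_hours(DATA_GB, rate):.1f} hours")
```

Under these assumptions, the same 100GB takes about two hours at the drive's rated 15MB/sec but roughly 12 days at 100KB/sec, which is why small-file workloads deserve separate treatment.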
Knowing that these kinds of small files exist in your environment helps you prepare for a successful recovery. To make the best use of your primary disk and tape backup destinations, a good first step is to classify data by its characteristics (file size and file volume) and by its volatility, because volatility affects how much data moves during incremental or cumulative incremental backups. As a rule, disk subsystems provide optimal performance for large numbers of small files, while tape works best for small numbers of large files. The reason boils down to random vs. sequential access to data on the respective device types.
Classifying backup clients into groups based on data characteristics creates a logical basis for segregating backup workloads among different types of target storage devices (see "Data and backup device mapping matrix").
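As a rough illustration, the classification described above might be sketched as follows. This is a minimal sketch and not the matrix from the article: the thresholds, the `Client` fields and the `target_device` function are all hypothetical, and it encodes only the general rule stated earlier (many small files to disk, few large files to tape). Mixed cases are flagged for manual review rather than guessed at.

```python
from dataclasses import dataclass

# Hypothetical thresholds; derive real values from measurements of your own environment.
SMALL_FILE_BYTES = 1024 * 1024  # average file size below this counts as "small"
MANY_FILES = 1_000_000          # file count at or above this counts as "many"


@dataclass
class Client:
    name: str
    total_bytes: int
    file_count: int

    @property
    def avg_file_bytes(self) -> float:
        # Guard against division by zero for empty clients.
        return self.total_bytes / self.file_count if self.file_count else 0.0


def target_device(client: Client) -> str:
    """Map a backup client to 'disk', 'tape' or 'review'."""
    many = client.file_count >= MANY_FILES
    small = client.avg_file_bytes < SMALL_FILE_BYTES
    if many and small:
        return "disk"    # large numbers of small files: random access helps
    if not many and not small:
        return "tape"    # small numbers of large files: sequential streaming
    return "review"      # mixed profile: left to the administrator


GB = 1024 ** 3
clients = [
    Client("file-server", 200 * GB, 5_000_000),
    Client("database", 500 * GB, 200),
]
for c in clients:
    print(c.name, "->", target_device(c))
```

A fuller version would also take volatility into account, for example by tracking how much of each client's data changes between backups, but that depends on what your backup software reports.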
Creating a general matrix for mapping client types to device types is one practical way for backup administrators to optimize utilization of tape devices for backup and restore operations, in cases when disk is already part of the picture. Because every backup environment is unique in terms of its data and its hardware and software infrastructure, coming up with an ideal backup data classification system requires extensive planning, measurement and adjustment along the way.
This was first published in October 2004