This article can also be found in the Premium Editorial Download "Storage magazine: How to plan for a disaster before a software upgrade."
The following tips will help lessen your big file backup problems.
It's a long-standing problem: As data piles up on a server, completing a successful backup becomes harder. Backup apps become bogged down with millions of files to examine, and network and CPU limits can stall throughput when transferring a gigantic file. Even if a backup job is successful, the data in a large file may have changed in the hours it took to create the backup image. Vendors and users are now applying new ideas and technologies to ensure that no data set is too big to back up.
The rapid creation and accumulation of stored data has pushed traditional backup approaches to their breaking point. Large amounts of storage capacity and advances in processing power have led users to believe that a virtually unlimited amount of data can be stored and protected, but most backup managers will admit that it's just not so. While tape drives have become larger and faster, and new technologies like LAN-free and disk-based backup have reduced the load, the old approach of "scan everything every night and back up what has changed" is failing.
Storage devices are limited by their interfaces. A fast disk drive or gigabit Ethernet network can transfer only a few dozen megabytes of data per second, and most are far slower. At that speed, copying the entire contents of a 300GB disk drive takes a few hours at best, even if no other factors are involved.
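To make the arithmetic concrete, here is a minimal sketch of the transfer-time estimate. The sustained rates of 20, 30 and 40 MB per second are illustrative assumptions standing in for "a few dozen megabytes per second"; they are not measurements from the article, and the function name is hypothetical.

```python
def transfer_hours(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to copy capacity_gb at a sustained rate_mb_per_s."""
    # Decimal units are assumed here: 1 GB = 1,000 MB.
    seconds = capacity_gb * 1000 / rate_mb_per_s
    return seconds / 3600


# Assumed sustained rates; real throughput varies with hardware and load.
for rate in (20, 30, 40):
    print(f"{rate} MB/s: {transfer_hours(300, rate):.1f} hours for 300GB")
```

Under these assumptions the 300GB copy takes roughly two to four hours, and that is before any scanning, contention or tape overhead is counted.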
Backup systems mitigate this problem in a number of ways. Most examine the contents of the drive and copy only what has changed since the last backup, and this incremental backup approach can greatly reduce backup times. Multiple backup jobs can also be run at the same time, taking advantage of the fact that each server and disk being backed up has its own interface. If that isn't enough, extra network connections, backup servers and tape drives can be added. These approaches have traditionally kept the world of backup afloat, but things are changing.
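The incremental approach can be sketched in a few lines. This is a simplified illustration, not how any particular backup product works: it assumes that a file's modification time is a reliable change indicator and that the time of the last backup is known. The function name `incremental_backup` and its parameters are hypothetical.

```python
import os
import shutil
from typing import List


def incremental_backup(source: str, dest: str, last_backup_time: float) -> List[str]:
    """Copy files under source modified after last_backup_time into dest.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(source):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            # Every file is examined, even if it turns out to be unchanged.
            if os.stat(src_path).st_mtime <= last_backup_time:
                continue
            rel_path = os.path.relpath(src_path, source)
            dst_path = os.path.join(dest, rel_path)
            os.makedirs(os.path.dirname(dst_path), exist_ok=True)
            shutil.copy2(src_path, dst_path)  # copies data and timestamps
            copied.append(rel_path)
    return copied
```

Note that the loop still has to examine every file to decide whether it changed, so the scan time grows with the number of files no matter how little data is actually copied.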
Tom Woods, backup supervisor at Ford Motor Co., was faced with a monumental backup task that challenged traditional approaches. "We had one system with 18 million files and we had to back it up every four hours," says Woods, "but just scanning that many files with TSM [IBM Corp. Tivoli Storage Manager] took five hours." Another system had massive database files that took hours to stream to tape, and applications had to be quiesced to keep changing data from ruining the usefulness of the backup copy. Finally, Woods had to consider whether his backup copies could be restored in a timely fashion. "The NDMP [Network Data Management Protocol] method we tried with our NAS servers was reliable, but restoring data at 200GB per hour meant it would take four or five days to recover," he explains.
This was first published in May 2008