|NDMP speeds backup traffic|
The Network Data Management Protocol (NDMP) began in 1996 as an initiative to create an open standard for network-attached storage (NAS) backup. Pioneered by Intelliguard and Network Appliance, NDMP is now supported by most major backup software and NAS hardware vendors, as well as operating system providers. In April 2000, control of the specification moved to a working group operating under the auspices of the Storage Networking Industry Association (SNIA). The protocol standard is currently in v.3, with v.4 being finalized by the NDMP Working Group.
NDMP defines a standard way to back up heterogeneous file servers on a network. A stated goal of the effort has been to create a standard approach to network backup that reduces the work backup software vendors must do to support the growing list of NAS platforms and operating system releases. This standards-based approach is designed to ensure the availability of backup-ready products from a variety of vendors, each of whom needs to make only a minimal investment.
According to NDMP.org, the protocol allows the creation of a common agent used by the central backup application to back up different file servers running different platforms and platform versions. An NDMP server essentially provides two services:
- A data server, which either reads from disk and produces an NDMP data stream (in a specified format) or reads an NDMP data stream and writes to disk, depending on whether a backup or restore is taking place
- A tape server, which either reads an NDMP data stream and writes it to tape, or reads from tape and writes an NDMP data stream, again depending on whether a backup or restore is taking place
Tape-handling functions, such as split-image issues, are managed by the tape service.
So how does NDMP shorten the backup window? NDMP minimizes network congestion by separating the data path from the control path. Backups occur locally on file servers, directly to Fibre Channel- or SCSI-attached tape drives, while management remains centralized. As a standard protocol, NDMP is promoted and supported by server vendors, backup software vendors and backup device vendors. A variety of products that use the protocol have sprung up--backup software, messaging appliances, tape products and products supporting NAS filers. Visit www.ndmp.org for more information, or view a list of compliant products at http://www.ndmp.org/products/index.shtml#backup.
The most common ways to shorten backups include:
- Reducing the amount of data selected for backup
- Reducing the number of full backups
- Using hierarchical storage management (HSM)-like tools to migrate data off primary file systems and data stores
Exclude lists can be maintained globally on the backup server or locally on each client. Excluding files can be tricky because not all systems follow the same naming or usage conventions. For example, some administrators may store files of value in a temp directory.
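A minimal sketch of how an exclude list might be applied, using Python's fnmatch module (the patterns shown are hypothetical examples, not any product's syntax):

```python
import fnmatch

# Hypothetical exclude patterns; real backup products define their own
# pattern syntax and scoping rules.
EXCLUDES = ["*.tmp", "*.mp3", "*/temp/*", "*/cache/*"]

def is_excluded(path, patterns=EXCLUDES):
    """Return True if the path matches any exclude pattern."""
    return any(fnmatch.fnmatch(path, pat) for pat in patterns)

def select_for_backup(paths, patterns=EXCLUDES):
    """Filter a candidate file list down to what actually gets backed up."""
    return [p for p in paths if not is_excluded(p, patterns)]
```

Note how a blanket `*/temp/*` pattern would silently skip any valuable files an administrator keeps in a temp directory--exactly the risk described above.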
If exclude lists don't appear to be a reliable option, the amount of data backed up each night can be reduced by performing full backups less frequently. A common approach is to perform a full backup once a week and incremental backups on the other six days. Even so, every week large numbers of files are backed up again, even though they haven't been touched since the last full backup.
Most configurations include a full backup at least once a week to reduce the number of incremental tape volumes that need to be read to perform a restore. If full backups are performed only once a month, up to 29 incremental backups might need to be restored to retrieve a full file system or directory. Traditional tape technologies--and loading and reading from a large number of tapes--are typically slow.
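The worst-case arithmetic is simple; a restore may need the last full backup plus every incremental taken since, as a sketch:

```python
# Worst case: a restore reads the last full plus every incremental since.
# Assumes (for simplicity) one tape volume per incremental backup.
def tapes_to_read(days_since_full, tapes_per_incremental=1):
    return 1 + days_since_full * tapes_per_incremental

# Weekly fulls cap the worst case at 1 full + 6 incrementals; monthly
# fulls can require 1 full + up to 29 incrementals.
```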
New technologies, such as backup to disk and synthetic fulls, are making full backups a less-frequent requirement. Backup to disk eliminates the tape-loading and seek overhead incurred with each individual incremental backup. Synthetic fulls create a full backup-like data set by copying files from the last full backup and subsequent incrementals on various tapes into a single data set that typically spans only a single tape volume (the number of volumes equals the data set size divided by tape volume capacity). Both backup to disk and synthetic fulls eliminate the need to load and read from individual incremental tape volumes, minimizing restore time.
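The merge a synthetic full performs can be sketched as follows. Backup sets are modeled here as simple dictionaries mapping path to (mtime, data), which glosses over tape handling and deleted files:

```python
# Illustrative sketch: a synthetic full keeps, for each file, the most
# recent copy found in the last full backup plus subsequent incrementals.
# Real products also handle deletions, tape spanning and cataloging.

def synthetic_full(last_full, incrementals):
    """Merge a full backup with its incrementals, newest copy winning."""
    merged = dict(last_full)
    for inc in incrementals:   # applied oldest to newest
        merged.update(inc)     # newer copies replace older ones
    return merged
```

A restore then reads only the synthetic full, instead of the original full plus each incremental volume in turn.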
A third method of reducing the amount of data being backed up is to deploy HSM technology. HSM is a policy-based data migration tool that moves infrequently accessed files to a different storage target. In addition to reducing the amount of primary storage required, HSM also reduces the amount of storage backed up during full backups because only a file stub remains on the primary file system when a file is migrated.
In many organizations, much of the data backed up during each full backup includes files that haven't changed in months or years. But HSM can present its own set of challenges. Specifically, the impact of an HSM solution on the various applications that might be affected needs to be understood. Also, there should be an awareness of how the HSM and backup applications interact.
Reducing the bottleneck
After reducing the amount of data being backed up through file exclusions, less-frequent full backups or HSM technologies, it's time to look at moving the data from the source client to the backup target more quickly. This effort entails finding and eliminating bottlenecks along the data path. Common bottlenecks in the data path include:
- Client resources such as disk drives, CPU cycles, memory, network interface and file system attributes
- Network resources such as the IP LAN or ISLs in the Fibre Channel (FC) SAN fabric
- Backup server resources such as CPU cycles, memory, network interface and tape/disk backup target devices
Data can only be backed up as fast as the client can source it. Running a backup creates a substantial load on the backup client because nearly every file in the file system must be read. Common client system components such as disk drives, memory, CPU cycles and the network interface are all taxed when a backup runs.
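The slowest-link idea behind these bottlenecks can be put into numbers. All throughput figures below are invented examples:

```python
# Back-of-the-envelope sketch: a backup moves data only as fast as the
# slowest link in its path. Figures (MB/s) are invented for illustration.

def backup_window_hours(data_gb, throughputs_mb_s):
    """Estimate the backup window from the slowest link in the data path."""
    bottleneck = min(throughputs_mb_s.values())   # slowest link wins
    seconds = (data_gb * 1024) / bottleneck
    return seconds / 3600

data_path = {
    "client disk reads": 80,    # MB/s
    "client CPU + NIC":  45,    # IP processing overhead
    "IP LAN":            90,
    "tape drive":       120,
}
```

With these example numbers, backing up 500 GB takes roughly 3.2 hours, gated entirely by the 45 MB/s link; a faster tape drive would not help until that link is widened.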
Physical storage is another potential bottleneck. It's not uncommon for the same group of spindles to be accessed by two different servers, or by two file systems on the same server. Depending on the disk configuration, simultaneous backups of different clients or file systems can create disk contention, limiting the client's ability to read the data. Rescheduling backups that share common spindles to run at different times is one solution.
Another bottleneck may occur in organizations where the majority of clients still back up over an IP LAN. The processing overhead associated with pushing a large, sustained stream of data through an IP network interface card (NIC) can tax client CPUs. This CPU load is frequently associated with iSCSI performance issues, but it can impact backup performance as well. Newer architectures such as SAN-based backups send data files over shared FC SAN-attached devices using a more efficient protocol optimized for the large, sustained data movement associated with backups. SAN-based backups also reduce the overall load on the company LAN because the data files are copied to the backup devices over the FC SAN.
File system characteristics also can be a source of slow backup performance. File systems with millions of small files (which is becoming more common) usually back up more slowly because of the overhead associated with recording the metadata for each file on the backup server and the time it takes for the file system to look for changed files.
Typically, the overhead of recording metadata is negligible because the amount of data per file is high. In systems with large numbers of small files, however, this ratio reverses and the overhead hampers overall performance. File systems also can cause bottlenecks during incremental backups, when the backup client must scan the file system to identify which files have changed.
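A simple model shows why per-file overhead dominates with small files. The overhead and throughput figures below are invented for the example:

```python
# Illustrative model: each file backed up pays a fixed per-file cost
# (metadata recording, open/close) on top of raw data transfer time.
# The 100 MB/s raw rate and 10 ms per-file overhead are invented.

def effective_mb_s(avg_file_kb, raw_mb_s=100.0, per_file_overhead_s=0.01):
    data_s = (avg_file_kb / 1024.0) / raw_mb_s     # time to move the data
    total_s = data_s + per_file_overhead_s         # plus fixed overhead
    return (avg_file_kb / 1024.0) / total_s
```

With these numbers, 100 MB files back up at nearly the raw device rate, while 10 KB files crawl along at under 1 MB/s--the same hardware, throttled by per-file overhead alone.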
This was first published in August 2004