Backups simply take too much time. Today's 24/7 companies can't take their networks offline to accommodate burgeoning backup windows. Instead, busy storage administrators are employing fast disk-based "snapshot" systems to periodically capture the state of mission-critical systems. Snapshots can establish recovery points in just a small fraction of the time needed for traditional backups and can significantly reduce the recovery point objective (RPO) by supporting more frequent recovery points. When properly implemented, snapshot data can be used to restore lost files or recover from more substantial data loss.
While snapshots offer some compelling advantages, the technology poses some unique challenges for the enterprise. This article provides an overview of snapshot technology and its role in the enterprise, highlights the leading vendors in the marketplace, and offers some advice to help ease purchase and implementation issues.
Understanding snapshot technology
Snapshot technology provides a fast and efficient means of capturing a system state, typically allowing far more granular restore points than traditional backup techniques like tape. For example, a daily incremental backup only captures data changes once each day. If a system fault occurs, up to an entire day's data changes may be lost -- in addition to hours of production time lost during the restoration process. Snapshots can be captured to disk far more frequently, supporting RPOs as low as several times each hour. If a file is lost or corrupted, it can typically be restored from the latest snapshot data in just a matter of moments.
Full copies vs. pointers
There are two means of handling snapshots. A full copy creates a complete duplicate of system data in a reserved storage location, while a pointer snapshot, sometimes called copy-on-write or COW, simply records an index of data locations. Full-copy snapshots double the disk storage requirements for your data. For example, if a utility like EMC Corp.'s TimeFinder performs a full-copy snapshot of 20 GB of corporate data, you'll need an additional 20 GB (a total of 40 GB) to hold the full-copy snapshot. Consequently, full-copy snapshots take the longest to produce, and their high utilization of storage space allows for the fewest snapshots. But full-copy snapshots can be utilized by other applications in the enterprise, such as business intelligence applications. Lost or corrupted data can also be restored directly from the full copy.
A faster and more popular approach to snapshots is COW, or the pointer snapshot. A pointer snapshot does not copy data. Instead, it simply records an index of data locations on the volume being protected. This requires far less storage space than a full copy. In most cases, the storage space allocated for a snapshot volume is only about 30% of the size of the data volume being protected. "There's nothing in it [the snapshot volume] other than some system-oriented metadata that the system needs to manage this whole environment," says Arun Taneja, consulting analyst and founder of the Taneja Group. For example, it would take up to about 30 GB to protect 100 GB of primary storage, bringing the total storage needs to 130 GB. Low storage demands also mean that pointer snapshots are quite fast, so the production network is not interrupted for more than a few seconds. This combination of low storage requirements and fast performance makes pointer snapshots ideally suited for frequent use, allowing for low RPO.
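The storage arithmetic for the two approaches can be sketched in a few lines of Python. This is only an illustration of the figures quoted above: the function names are hypothetical, and the 30% reserve is the article's rule of thumb rather than a fixed property of any product.

```python
def full_copy_total_gb(data_gb: float, copies: int = 1) -> float:
    """Total capacity when each snapshot is a complete duplicate of the data."""
    return data_gb * (1 + copies)


def pointer_total_gb(data_gb: float, reserve_fraction: float = 0.30) -> float:
    """Total capacity when a snapshot volume is reserved as a fraction of the data.

    reserve_fraction is an assumption (the article's "about 30%" figure).
    """
    return data_gb * (1 + reserve_fraction)


# 20 GB of data plus one full-copy snapshot
print(full_copy_total_gb(20))
# 100 GB of data plus a 30% snapshot reserve
print(pointer_total_gb(100))
```

Each additional full copy adds the full size of the data again, which is why the number of full-copy snapshots an array can hold is so much smaller.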
As you might expect, simply retaining a list of data locations is useless if data changes after the list is made, so changed data must also be accounted for. When a change is made to data (e.g., an Oracle database is updated), any changed blocks are saved to the snapshot volume along with the next index of pointers. If a previous snapshot must be restored, the older/changed blocks will be taken from the snapshot volume and restored to the data volume using the index of pointers as a roadmap -- this is how a snapshot system is used to restore data.
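The copy-on-write mechanism described above can be illustrated with a toy model. This is a minimal sketch, not any vendor's implementation: the `CowVolume` class and its methods are hypothetical, blocks are plain Python values, and it assumes the snapshot volume stores the original contents of a block the first time that block is overwritten after a snapshot.

```python
class CowVolume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # the live data volume
        self.snapshots = []         # per snapshot: {block_index: original_content}

    def snapshot(self) -> int:
        """Establish a recovery point; no data is copied at this moment."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data) -> None:
        """Save the original block for the latest snapshot, then overwrite it."""
        if self.snapshots and index not in self.snapshots[-1]:
            self.snapshots[-1][index] = self.blocks[index]
        self.blocks[index] = data

    def restore(self, snap_id: int) -> None:
        """Roll the data volume back to the state captured by snapshot snap_id."""
        # Undo changes from the newest snapshot back to the requested one.
        for saved in reversed(self.snapshots[snap_id:]):
            for index, original in saved.items():
                self.blocks[index] = original
        # Later recovery points no longer apply; the restored one starts clean.
        del self.snapshots[snap_id + 1:]
        self.snapshots[snap_id] = {}


vol = CowVolume(["a", "b", "c"])
first = vol.snapshot()
vol.write(1, "B")          # original "b" is preserved for the first snapshot
second = vol.snapshot()
vol.write(2, "C")
vol.restore(first)
print(vol.blocks)          # back to the contents at the first snapshot
```

Note that taking a snapshot costs almost nothing in this model; space is consumed only as blocks change, which is why the snapshot volume can be much smaller than the data volume.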
Recognizing potential weaknesses
One of the biggest concerns with snapshot technology is that the snapshot volume typically resides on the same storage system as the data volume. This is because most storage array vendors natively support snapshot functionality on their own systems. As an example, an EMC Symmetrix will often hold the snapshots created from its own local data. Obviously, this presents a certain amount of risk for storage administrators. "Snapshot technology is great as long as that array doesn't die on you," Taneja says.
According to Taneja, there are two means of mitigating that risk. One tactic is to store the snapshot volume on a different storage system. The data volume may reside on a Symmetrix, for example, but the snapshot might be hosted on a Clariion -- though only a small minority of snapshot users choose that option. The more common tactic is to back up the snapshot volume, which can be accomplished nondisruptively.
Vendors and product selection
Snapshot functionality is extremely commonplace, and can be implemented using virtually any storage array. "Any decent storage box will have snapshot software available," Taneja says. "The majority of the time you're buying the snapshot technology along with the array." Before shopping for snapshot products, check with your current storage system vendors; chances are that snapshot software is already available for free (or available at a reasonable cost).
Analysts point to IBM, EMC, Hitachi Data Systems Inc., Hewlett-Packard Co. and Network Appliance Inc. as some of the largest storage system vendors to include snapshot functionality. StoreAge Ltd. supports pointer-based snapshots, but its product is geared toward heterogeneous storage environments. Other vendors supporting snapshots include BlueArc Corp., OnStor Inc., Engenio Information Technologies Inc., Xiotech Corp. and EqualLogic Inc.
The real task when evaluating snapshot tools is to understand the functionality and interoperability of each individual product. For instance, a product like EMC's SnapView software can create pointer-based snapshots as well as full-volume clones of your data. In terms of compatibility, a shop with an EMC Clariion may choose to deploy snapshot software such as SnapView because it is directly compatible with the hardware, while a multi-vendor shop might prefer to deploy a snapshot tool like multiMirror from StoreAge that is compatible with its heterogeneous storage environment. It's a matter of preference and meeting the unique business needs of your organization.
Selecting the right product
Product selection is typically quite easy since most storage system vendors include snapshot software with their products -- it's just a matter of installing and configuring the software. However, if you are in a position to acquire snapshot software, analysts suggest using the following list, which can help you identify the best product for your own production environment.
Consider pointer vs. full snapshots. Although many snapshot tools simply produce an index of pointers, some tools can generate a full duplicate of your data. Pointer-based snapshots are extremely fast and require a snapshot volume of only about 30% of the size of the data being protected. Unfortunately, you still need to back up the current data set along with the snapshot index. "The reason to understand that difference is so that you're not surprised when you have a disaster and you can't recover because all you have are the pointers that were protected," says Greg Schulz, founder and senior analyst at StorageIO Group. By comparison, full-copy snapshots require the same amount of storage as the data being protected, doubling storage requirements, but produce a complete duplicate of the data that can be restored directly or used in other business applications.
Determine the snapshot frequency and storage space. Snapshot platforms are typically limited by the storage space that can be reserved for your snapshot volume, so storage limitations can have a profound impact on the number of snapshots that your system can handle. More frequent snapshots will generally require a larger snapshot volume -- requiring a bigger commitment of available storage space on the array. When storage space is tight, you might be limited to only a few snapshots per day. A full-copy snapshot doubles storage demands for a single snapshot. For example, taking a single full-copy snapshot of 20 terabytes (TB) of data would require a storage array with a minimum of 40 TB of space. Therefore, try to balance snapshot frequency with storage costs.
Consider support from key applications. Snapshots are often used to protect specific enterprise applications, so look for integration with key applications like SQL, Exchange, Oracle and others. Solid integration allows snapshots to be handled directly within the application itself, and snapshot data can be restored directly to the application. This can be more efficient than trying to recover data to a specific folder or drive location.
Determine the backup options for your snapshots. Snapshots are not backups and do not alleviate the need to perform backups of the original data or the snapshot volume. This is particularly important because snapshot volumes are often stored on the same array as the data being protected. Fortunately, a snapshot volume can be backed up to tape, a virtual tape library or another storage array without interfering with normal network operation. Examine the available backup options and choose a snapshot platform that is most compatible with your backup system.
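The balance between snapshot frequency and storage space in the list above can be estimated roughly. The sketch below assumes, purely for illustration, that each pointer snapshot consumes space equal to the data changed since the previous snapshot and that this amount is constant; real change rates vary, and the function name is hypothetical.

```python
def snapshots_that_fit(reserve_gb: float, changed_gb_per_snapshot: float) -> int:
    """Estimate how many pointer snapshots a reserved snapshot volume can hold.

    Assumes every snapshot interval changes the same amount of data.
    """
    if changed_gb_per_snapshot <= 0:
        raise ValueError("changed_gb_per_snapshot must be positive")
    return int(reserve_gb // changed_gb_per_snapshot)


reserve_gb = 30.0  # e.g. about 30% of a 100 GB data volume
for changed_gb in (1.0, 2.5, 10.0):
    count = snapshots_that_fit(reserve_gb, changed_gb)
    print(f"{changed_gb} GB changed per snapshot: room for {count} snapshots")
```

The busier the data set, the fewer recovery points the same reserve can hold, so a heavily updated application may need a larger snapshot volume or a shorter retention period.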
Best practices for implementation
Since snapshot tools are frequently available as a feature of a storage array, the storage array vendor can usually help you to follow the best practices and configuration options that are most appropriate for your particular organization. However, analysts offer some general policies that can help you get the most from any snapshot tool.
Configure the snapshots to match your RPO. Each snapshot represents a recovery point, so configure a snapshot frequency to provide the desired RPO. "If you can't afford to lose more than one hour's worth of data, you should be taking snapshots at one-hour intervals," Taneja says. Frequent snapshots reduce the RPO, and a snapshot can be recovered in just a few minutes. Remember that storage space and functional limitations may restrict the number of snapshots that you can capture over an hour or a day.
Be selective about what data to snapshot. Although it's certainly possible to protect a variety of applications through periodic ubiquitous snapshots, an increasing number of snapshot users are tailoring their snapshot use to meet the unique needs of particular file types. Decide what files must be protected (e.g., Oracle databases or Exchange records) and how often each file type should be protected. "You need to understand what files you have out there that need to be snapshot protected," Schulz says. "Some things may only have to be 'snapshotted' once a day or once a week -- other things may have to be 'snapshotted' on a more frequent basis."
Set realistic backup objectives. Snapshots still require backups, but it may not be possible to back up every snapshot, especially frequent snapshots. "Even if it's happening in the background and not disrupting the application, the backup system may not be able to cope with backing up a snapshot every hour," Taneja says. One solution may be to stagger backups -- perhaps backing up every fourth snapshot rather than every individual snapshot. Backup frequency can also influence how soon you start overwriting the snapshot volume. For example, if you only back up every 12th snapshot, you will need the snapshot volume space to hold at least 12 snapshots because you certainly wouldn't want to start overwriting snapshot entries until a backup has been performed.
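Two of the practices above lend themselves to a quick check: deriving a snapshot schedule from the RPO, and confirming that the snapshot volume retains enough entries for a staggered backup scheme. The following is a rough sketch under simplifying assumptions: snapshots are evenly spaced, the snapshot volume holds a fixed number of entries and overwrites the oldest first, and a backup covers every snapshot taken up to that point. Both function names are hypothetical.

```python
from collections import deque
from datetime import timedelta


def snapshots_per_day(rpo: timedelta) -> int:
    """Minimum evenly spaced snapshots per day so no gap exceeds the RPO."""
    if rpo <= timedelta(0):
        raise ValueError("rpo must be positive")
    return -(-timedelta(days=1) // rpo)  # ceiling division


def overwritten_before_backup(capacity: int, backup_every: int,
                              total_snapshots: int) -> list[int]:
    """Return the snapshots that were overwritten before any backup covered them."""
    retained = deque()     # snapshot numbers currently on the snapshot volume
    last_backed_up = 0     # highest snapshot number covered by a backup
    lost = []
    for snap in range(1, total_snapshots + 1):
        if len(retained) == capacity:
            oldest = retained.popleft()      # overwrite the oldest entry
            if oldest > last_backed_up:
                lost.append(oldest)
        retained.append(snap)
        if snap % backup_every == 0:         # staggered backup runs here
            last_backed_up = snap
    return lost


print(snapshots_per_day(timedelta(hours=1)))
print(overwritten_before_backup(capacity=12, backup_every=12, total_snapshots=36))
print(overwritten_before_backup(capacity=11, backup_every=12, total_snapshots=36))
```

In this model, backing up every 12th snapshot with room for only 11 entries loses a recovery point in every backup cycle, while room for 12 loses none, which matches the guidance above.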