For most U.S. enterprises, data protection will undergo a complete overhaul in the next 24 months. Simply put, the current data protection environment is at a breaking point, and organizations that must comply with new regulations have no choice but to revamp their backup and restore infrastructure.
Backup windows are shrinking as our Web-based economy expands and organizations do business around the clock. A major user study conducted by the Taneja Group, Hopkinton, MA, in conjunction with Storage in September 2003 showed that 67% of the respondents had a backup window of eight hours or less (see "Plan on disk-based backup"). Yet the amount of data to be backed up daily continues to mount. At the same time, businesses have adopted a zero-tolerance policy for data loss and have instituted stringent recovery time objectives.
During the past 20 years, there hasn't been much change in the way businesses perform backups and restores. But cheap disk prices, along with the commercialization of content-addressed storage (CAS)--which reduces the amount of data that needs to be protected--are fueling major changes in the way businesses protect their data.
Here's the payoff: If users choose wisely, they can improve the reliability and speed of their backups and restores by at least an order of magnitude. The Taneja Group classifies next-generation backup and restore products into the following categories (with some falling into several categories):
- Minimize disruption to the current environment
- Reduce data
- Provide continuous backup and restorability
- Eliminate tape
- Use replication technologies for backup
The most common backup and restore environment today contains a mixture of network-attached storage (NAS), storage area network (SAN) and direct-attached storage (DAS) (see "Typical backup environment"). The daily grind of incremental and full backups means that when the backup occurs, the data flows from the application servers over the LAN (or SAN) to the backup server, and then into the networked tape library.
However, once disk is introduced into the infrastructure as a backup medium, the situation changes dramatically (see "Pros and cons of new backup and restore solutions"). Vendors such as Advanced Digital Information Corp. (ADIC), Alacritus Software, Diligent Technologies Corp., Neartek Inc., Quantum Corp. and Sepaton Inc. have developed technologies that virtualize disk arrays and emulate a tape library.
[Table: Pros and cons of new backup and restore solutions]
The ADIC, Quantum and Sepaton offerings are delivered as prepackaged appliances that include disk storage capacity. Alacritus, Diligent and Neartek, on the other hand, are software-based tape virtualization products that are capable of running on a standard server platform with a variety of disk storage on the back-end.
When virtual tape is brought into the environment, the disk system acting as a virtual tape library becomes the target device for the backup server instead of the physical tape library. Otherwise, the environment doesn't change. Once the backup completes, the application servers are free to continue regular operations. Data from the secondary disk, under the command of the regular backup application--whether it's IBM's Tivoli Storage Manager (TSM), Legato's NetWorker or Veritas Software Corp.'s NetBackup--can be copied from disk to tape (cloning) or moved from disk to tape (staging) based on user-defined policies in the backup application.
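The difference between cloning and staging can be sketched in a few lines. This is an illustration only, not the behavior of TSM, NetWorker or NetBackup: the pools are modeled as plain dictionaries, and the `apply_policy` function and its `stage_after_days` threshold are hypothetical stand-ins for a user-defined policy.

```python
def clone(disk, tape, image_id):
    """Copy a backup image to tape; the disk copy stays available."""
    tape[image_id] = disk[image_id]


def stage(disk, tape, image_id):
    """Move a backup image to tape, freeing the disk space."""
    tape[image_id] = disk.pop(image_id)


def apply_policy(disk, tape, ages_in_days, stage_after_days=7):
    """Hypothetical policy: stage old images, clone recent ones."""
    for image_id in list(disk):  # copy keys; stage() mutates disk
        if ages_in_days[image_id] >= stage_after_days:
            stage(disk, tape, image_id)
        else:
            clone(disk, tape, image_id)


disk_pool = {"old-full": "image data A", "new-inc": "image data B"}
tape_pool = {}
apply_policy(disk_pool, tape_pool, {"old-full": 14, "new-inc": 1})

print(sorted(disk_pool))  # recent image is still on disk for fast restores
print(sorted(tape_pool))  # both images now exist on tape
```

In this sketch, the recent image can still be restored from disk, while the older one lives only on tape.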
Tape virtualization and emulation technologies minimize the level of disruption to existing backup procedures, policies and the underlying backup infrastructure when disk is introduced. What's more, users aren't required to purchase disk backup options from their backup application software vendor. When considering tape emulation technology, a user needs to weigh the trade-offs between more open software-based approaches and prebuilt appliance models with back-end disk capacity. Users should also confirm that the required tape library and tape formats are supported by the specific vendors.
Disk backup options
All major backup application software vendors now have disk backup options that take full advantage of the unique properties of disk as compared to tape: random access, speed and relative reliability. Instead of laying down data in tape format, the backup application takes advantage of the random access nature of disk to lay down data consecutively for more rapid restores. In addition, backup applications now support concurrent read/write operations, reducing the time required for restore, staging and cloning operations.
Legato NetWorker has introduced an advanced file type device for this purpose. Products like Veritas NetBackup 5.0 enable users to create a synthetic full backup from a full backup and a series of incrementals, dramatically reducing the need to perform a full backup on a regular basis.
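The idea behind a synthetic full backup can be sketched as follows. This is a minimal illustration under stated assumptions, not NetBackup's actual implementation: each backup is modeled as a hypothetical dictionary of file path to content, and a value of `None` in an incremental is assumed to mark a deleted file.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup with later incrementals (oldest first).

    Each backup is a dict of path -> content. A value of None in an
    incremental marks a file deleted since the previous backup (an
    assumption of this sketch, not a documented format).
    """
    synthetic = dict(full)  # start from the last real full backup
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                synthetic.pop(path, None)  # file was deleted
            else:
                synthetic[path] = content  # new or changed file
    return synthetic


sunday_full = {"/db/a.dat": "v1", "/db/b.dat": "v1", "/etc/app.conf": "v1"}
monday_inc = {"/db/a.dat": "v2"}
tuesday_inc = {"/db/b.dat": None, "/db/c.dat": "v1"}

synthetic_full = synthesize_full(sunday_full, [monday_inc, tuesday_inc])
print(synthetic_full)
```

The merge runs entirely on the backup server's disk copies, so the application servers never have to send a second full backup.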
Consider, for example, how backup applications multiplex streams of data from several application servers (see "Multiplexed data on tape and disk"). On tape, the streams are interleaved in a mixed fashion; on disk, random access lets the application lay down each server's data consecutively. For users who are happy with their existing backup software vendors and their future product direction, these offerings may be very appealing.
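A small sketch shows the difference between the two layouts. The client names and block labels are made up for illustration, and the round-robin interleaving is an assumption; real backup applications choose their own multiplexing order.

```python
from collections import defaultdict
from itertools import zip_longest


def multiplex(streams):
    """Interleave blocks from several clients round-robin, as on tape."""
    labeled = [[(client, block) for block in blocks]
               for client, blocks in streams.items()]
    tape = []
    for row in zip_longest(*labeled):
        tape.extend(item for item in row if item is not None)
    return tape


def lay_out_on_disk(tape):
    """Regroup interleaved blocks so each client's data is contiguous."""
    disk = defaultdict(list)
    for client, block in tape:
        disk[client].append(block)
    return dict(disk)


streams = {"mail": ["m1", "m2", "m3"], "db": ["d1", "d2"], "web": ["w1"]}
tape_layout = multiplex(streams)
disk_layout = lay_out_on_disk(tape_layout)

print(tape_layout)  # blocks from different clients are mixed together
print(disk_layout)  # each client's blocks sit side by side
```

Restoring the "db" client from the tape layout means skipping past other clients' blocks; from the disk layout, it is a single contiguous read.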
Data reduction is a great way to reduce secondary storage. On average, the ratio of backed up data to primary data is 10:1. What if the same amount of information that is buried in 10TB could be squeezed into 1.2TB of secondary disk? This is now possible with CAS technology, which is sometimes called object-based storage. Although CAS technology was first commercialized by EMC Corp. with its Centera and targeted at the archival market, other vendors, notably Avamar Technologies, have applied CAS to create an innovative backup and restore solution that is focused on reducing backed up data stored on secondary media. Data Domain, Permabit Inc. and a number of other newcomers have also entered this market.
Here's how Avamar, for example, cuts down the amount of stored data. The first time the product is used, each file is broken into variable-sized chunks of data (not to be confused with blocks). Each chunk is usually stored only once--or a few times, depending upon one's requirements for additional protection--and a file is re-created on the fly from chunks, independent of their location. The agents on the application servers (clients to the backup application) determine if a chunk already exists on the secondary storage. If it does, it isn't transmitted across the network.
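The store-each-chunk-once idea can be sketched as follows. This is not Avamar's algorithm: for brevity the sketch uses fixed-size chunks rather than variable-sized ones, identifies chunks by SHA-256 digest, and uses a hypothetical `ChunkStore` class with a tiny chunk size chosen only to keep the example readable.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, for illustration only


class ChunkStore:
    """Hypothetical secondary store keyed by the hash of each chunk."""

    def __init__(self):
        self.chunks = {}     # digest -> chunk bytes, stored once
        self.bytes_sent = 0  # data that actually crossed the network

    def backup(self, data):
        """Store unseen chunks and return the file's list of digests."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only new chunks are sent
                self.chunks[digest] = chunk
                self.bytes_sent += len(chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Re-create a file on the fly from its chunks."""
        return b"".join(self.chunks[digest] for digest in recipe)


store = ChunkStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # one chunk changed since Monday

monday_recipe = store.backup(monday)
tuesday_recipe = store.backup(tuesday)

print(store.bytes_sent)  # total bytes transmitted for both backups
print(store.restore(tuesday_recipe) == tuesday)
```

Two 16-byte backups cost 20 bytes of transfer and storage here, because Tuesday's backup sends only the one chunk the store has not seen before.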
Given the high level of duplication in full backups and even incrementals (when a single bit is changed, many products transmit the entire file over again), the reduction of data is massive. Avamar claims a 99% reduction in network traffic when compared to traditional backup methods. This powerful data reduction capability makes possible a variety of new data protection schemes, including the cost-effective replication of data to secondary disk storage at a disaster recovery site over IP. In some enterprises, the technology has the potential to eliminate tape completely as a backup medium.
Although CAS promises to drastically reduce the ratio of secondary to primary storage, there are downsides. Avamar, for example, requires that users replace their existing backup application with Avamar agents, and all backed up data is stored in a format that only Avamar can decode.
These data protection approaches--when used in conjunction with snapshot technologies--only provide data restorability to specific points in time. For example, if snapshots are taken hourly, an average of 30 minutes of data may be lost. Unfortunately, the only viable alternative to date has been to increase the frequency of snapshots. But now, companies such as Revivio Inc., Timespring Software Corp., Vyant (now Mendocino), XOsoft and others are delivering solutions that provide data protection at any point in time.
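The 30-minute figure follows from simple arithmetic, assuming a failure is equally likely at any moment between two snapshots. The function name below is illustrative.

```python
def snapshot_data_loss(interval_minutes):
    """Return (average, worst-case) minutes of data lost on a failure.

    Assumes failures are uniformly distributed between snapshots, so the
    average loss is half the interval and the worst case is the full
    interval.
    """
    average = interval_minutes / 2
    worst_case = interval_minutes
    return average, worst_case


for interval in (60, 15, 1):
    average, worst_case = snapshot_data_loss(interval)
    print(f"every {interval} min: average {average} min, worst {worst_case} min")
```

Shrinking the interval reduces the exposure but never removes it, which is the gap the any-point-in-time products aim to close.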
Revivio, for example, focuses on critical application environments that have traditionally relied on expensive array-based snapshot technologies like EMC's TimeFinder to provide point-in-time copies of application data. To minimize the risk of data loss, administrators must create multiple concurrent business continuance volumes (BCVs) on their Symmetrix arrays, which is very costly. And if the primary volume gets corrupted, the recovered volume could substantially lag behind the primary volume and data will be lost.
The Revivio solution deals with this problem by attaching its appliance to the SAN and creating a third mirror of the live volume. With this technology, Revivio claims that there's no need to create BCVs or maintain mirrors of BCVs. The result? Your storage needs just dropped dramatically. Every write to the primary storage is captured (no agents are required to be placed on application servers), time stamped and redirected to the Revivio appliance (see "A typical Revivio setup").
If the primary volume gets corrupted--say at 2:30 p.m.--the IT administrator can ask for the volume as of 2:29 p.m., and the Revivio appliance will re-create it from the combination of the third mirror and its TimeStore software by essentially deleting all writes that occurred between 2:29 and 2:30 p.m. The volume as of 2:29 p.m. is mountable directly from the appliance as soon as it is ready (a few minutes) and the application can be restarted. Now, the 2:29 p.m. volume is fed into the primary storage, and as a second step brought into complete harmony with the current state. At this time, the triple mirror and TimeStore are returned to their regular responsibility of data protection and the primary volume takes over. Other products apply variations of this theme, focusing on Exchange and SQL Server environments.
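The rollback described above can be sketched with a time-stamped journal. This is an assumption-laden illustration, not how Revivio's TimeStore works internally: the `TimeIndexedVolume` class is hypothetical, the journal is assumed to record the previous contents of every overwritten block, and times are zero-padded 24-hour strings so they compare correctly as text.

```python
class TimeIndexedVolume:
    """Hypothetical mirror plus a journal of time-stamped writes."""

    def __init__(self, blocks):
        self.mirror = dict(blocks)  # tracks the live volume
        self.journal = []           # (time, block, previous contents)

    def write(self, time, block, data):
        """Capture a write; writes are assumed to arrive in time order."""
        self.journal.append((time, block, self.mirror.get(block)))
        self.mirror[block] = data

    def image_as_of(self, time):
        """Rebuild the volume by undoing every write made after `time`."""
        image = dict(self.mirror)
        for stamp, block, previous in reversed(self.journal):
            if stamp <= time:
                break  # everything older belongs in the image
            if previous is None:
                image.pop(block, None)  # block did not exist yet
            else:
                image[block] = previous
        return image


volume = TimeIndexedVolume({0: "header", 1: "orders-v1"})
volume.write("14:10", 1, "orders-v2")
volume.write("14:29", 2, "audit-v1")
volume.write("14:30", 1, "GARBAGE")  # corruption begins
volume.write("14:30", 0, "GARBAGE")

recovered = volume.image_as_of("14:29")
print(recovered)
```

The recovered image contains every write up to and including 2:29 p.m. and none of the corrupt writes that followed, which matches the crash-consistent view described in the text.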
These solutions aren't for everyone. Only the most critical applications--especially those where downtime costs are astronomical--need this level of protection. Another fact worth noting is that while these products will reinstate a volume as of any point in time in the past, its consistency is only guaranteed as crash consistent. This means that the volume is presented as it was at that moment in time. Because application consistency is what one ultimately cares about, a crash-consistent image takes you a step closer, but it's not a full solution. The administrator may still need to apply logs in order to create the right total picture for a database application. The only way to get application-level consistency is to take point-in-time images. And then you are back to where you started.
Revivio only comes into play when volumes need to be recovered, and later resynchronized. The solution is used for volume--not file--recovery and is most effective for database applications. Other vendors--XOsoft, for example--are claiming application-level consistency and file-level recovery, so it's crucial to drill down and understand these critical differences.
The elimination of tape
Should you discard tape if you install a disk-to-disk (D2D) solution? The answer in most cases is probably not yet, maybe never. Why? Because most organizations are so procedurally and emotionally attached to tape that it would be too drastic to eliminate it. Of course, the answer depends on which new backup solution is picked. In the case of Avamar, for example, where data is radically reduced, you could replicate data over long distances and store huge amounts of data directly on disk. But in the case of Revivio, for instance, you probably would still want to back up from the appliance to tape, duplicate the tape and send one set offsite.
Until recently, replication technology has almost invariably been associated with business continuity and disaster recovery. Data is first protected on the local site using regular backup to tape, and specific volumes that are considered critical are replicated to a remote site using technology such as SRDF from EMC or IBM's Peer-to-Peer Remote Copy (PPRC). These tools typically replicate data between arrays on a synchronous basis--the local write is not acknowledged as complete until the remote site has signaled that it has received and written the data to its disk system. In more cost-sensitive environments, users deploy host-to-host products such as Veritas Volume Replicator (VVR) and NSI Software's DoubleTake. These work asynchronously and are independent of the underlying storage platform, giving users the advantage of replicating from one type of disk system to another. While the distance limitation of synchronous replication is eliminated, most asynchronous products have at least some conditions under which they deliver inconsistent data on the remote site. Overall, they still provide reasonable protection levels for many less-than-mission-critical applications.
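The trade-off between the two modes can be sketched with a toy model. It does not represent SRDF, PPRC, VVR or DoubleTake: the `Replicator` class and its `ship` method are hypothetical, and network latency is reduced to the order in which the steps happen.

```python
from collections import deque


class Replicator:
    """Toy model of one volume replicated to a remote site."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.local = {}
        self.remote = {}
        self.pending = deque()  # writes not yet sent (asynchronous only)

    def write(self, block, data):
        if self.synchronous:
            self.remote[block] = data  # remote site confirms first...
            self.local[block] = data   # ...then the write completes locally
        else:
            self.local[block] = data   # complete locally right away
            self.pending.append((block, data))  # send to remote later
        return "ack"

    def ship(self, max_writes):
        """Send up to max_writes queued writes to the remote site."""
        for _ in range(min(max_writes, len(self.pending))):
            block, data = self.pending.popleft()
            self.remote[block] = data


sync_volume = Replicator(synchronous=True)
async_volume = Replicator(synchronous=False)
for block, data in [(0, "a"), (1, "b"), (2, "c")]:
    sync_volume.write(block, data)
    async_volume.write(block, data)

async_volume.ship(1)  # the link falls behind; then the local site fails

print(sync_volume.remote)   # identical to the local copy
print(async_volume.remote)  # missing the writes still in the queue
```

The synchronous volume pays for its guarantee on every write, which is why distance matters; the asynchronous volume acknowledges immediately but can lose whatever was still queued when the local site fails.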
Thus far, backup and replication technologies have been distinct from each other, but a new crop of products is blurring that distinction. Such products are being delivered by next-generation replication-focused vendors like Kashya and Topio, by virtualization software players such as StoreAge Networking Technologies and FalconStor Software, and by virtualization appliance vendors such as Candera Inc., DataCore Software Corp. and Troika Networks. Using different methods, these products are able to significantly reduce the data that's transferred, so that consolidated remote backup becomes a feasible concept. Instead of doing a local backup--creating tapes for local and offsite storage and replicating the most important data--why not combine this functionality and just back up to a remote site, disk to disk, and eliminate tapes altogether?
In particular, if a company has a large number of remote sites that have little or no IT presence, why not back up to a central site where IT personnel can apply common methods and policies across the enterprise? It's important for users to validate a vendor's ability to maintain data consistency across multiple data sources. The application of replication technology for backup consolidation purposes will create new avenues for inexpensive data protection for many businesses, particularly midsized companies.
A new era
A new era of data protection is fast approaching. As a result, most companies will revamp the way they have done backup, restore, replication and archiving within the next two years. Small storage companies that "don't know what can't be done" are creating the foundation for these innovative ways to protect data.