Backing up multiple terabytes of data

Achieving 1 TB/hour backup speed.

Recently, the Storage Networking Industry Association (SNIA) demonstrated how to achieve today's state of the art in backup speed: one terabyte per hour.

At the Storage Networking Technology Center in Colorado Springs, demo teams carefully divided the data into 16 equal volumes of 64G bytes and used large tape libraries with sophisticated data movers and the fastest tape drives available today. Even so, few teams were able to reach the necessary backup rate. Achieving the goal requires moving data from disk to tape at an impressive sustained rate of 350M bytes per second.
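The figures above can be checked with some quick arithmetic. The sketch below is only illustrative: it assumes binary units (1G byte = 2^30 bytes), which the article does not specify, and the function name is made up for this example.

```python
GIB = 2**30  # assumption: "64G bytes" means 64 * 2^30 bytes


def required_rate_mb_per_s(total_bytes: int, window_seconds: float) -> float:
    """Average rate, in millions of bytes per second, needed to move
    total_bytes within window_seconds."""
    return total_bytes / window_seconds / 1e6


total = 16 * 64 * GIB                       # 16 volumes of 64G bytes each
rate = required_rate_mb_per_s(total, 3600)  # one-hour window
print(f"total = {total / 2**40:.0f} TiB, minimum average rate = {rate:.0f} MB/s")
```

Under these assumptions the minimum average works out to roughly 305M bytes per second. The 350M figure quoted above presumably leaves headroom for time in which the drives are not streaming, although the article does not break this down.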

But if the technology allows a backup rate of just 1T byte per hour, how is it possible to back up a site with six to 10T bytes every day? With this much data, even incremental backups take a long time just to search for changes.
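To put the question in numbers, the following sketch estimates how long a full backup would take at the demonstrated rate. It assumes the 1T byte per hour rate is sustained for the whole run, which is optimistic; the function name is hypothetical.

```python
def full_backup_hours(data_tb: float, rate_tb_per_hour: float = 1.0) -> float:
    """Hours needed for a full backup at a constant sustained rate."""
    return data_tb / rate_tb_per_hour


for size_tb in (6, 10):
    hours = full_backup_hours(size_tb)
    print(f"{size_tb} TB at 1 TB/hour: {hours:.0f} hours, {hours / 24:.0%} of a day")
```

Even at state-of-the-art speed, a daily full backup of such a site would occupy between a quarter and roughly 40 percent of every day.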

Today, it is very common to find sites with multiple terabytes of data, even at midsize companies. At the same time, the value of the data and the cost of data loss are constantly rising, which suggests that backups should be performed even more frequently.

It is evident that the gap between backup speed and the amount of data is growing exponentially, while the frequency at which this data needs to be saved is also increasing.

What will happen in two years?
For companies that have this problem today, it will become critical. For many others, it will become a new problem. The solution needs to come from technologies other than those used today for backup.

Backup is mostly used to prevent data loss in the following scenarios:
1. Storage device failures.
2. Site failures and disasters.
3. Software or human errors.

In the first two scenarios, hardware redundancy (such as redundant disks, controllers, HBAs and fabrics) provides the most cost-effective solution. In environments with multiple terabytes of data, the cost of downtime or data loss is much higher than the cost of the redundant hardware.

But hardware redundancy doesn't prevent data loss caused by a virus, a hacker, or a software or human error. For example, if a database is corrupted, it will also be corrupted at the mirror site, even if completely mirrored sites are implemented.

The proposed solution for the third scenario is multilevel snapshot technology. This makes it possible to create almost instant virtual copies of data at multiple points in time, without needing to move data. Multilevel snapshot, combined with hardware redundancy, provides a very scalable solution for protecting data in multi-terabyte environments. Because a snapshot copy is created within seconds regardless of the amount of data, backups can be taken far more frequently, which drastically reduces the potential cost of data loss and downtime.
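One common way to obtain such virtual copies is copy-on-write. The toy sketch below illustrates the idea under that assumption; it is not a description of any particular product, and the class and method names are hypothetical. Creating a snapshot copies no data, and an old block is preserved only when it is first overwritten.

```python
class SnapshotVolume:
    """Toy block volume with copy-on-write, multiple point-in-time snapshots."""

    def __init__(self, num_blocks: int):
        self.blocks = [b""] * num_blocks
        # One dict per snapshot: block index -> contents preserved for it.
        self.snapshots = []

    def create_snapshot(self) -> int:
        """Open a new snapshot. No data is copied, so the cost does not
        depend on the size of the volume."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index: int, data: bytes) -> None:
        # Preserve the old contents in the newest snapshot, but only on the
        # first write to this block since that snapshot was taken.
        if self.snapshots and index not in self.snapshots[-1]:
            self.snapshots[-1][index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id: int, index: int) -> bytes:
        # The block as of snapshot snap_id is the earliest copy preserved by
        # that snapshot or a later one; if none exists, the block has not
        # changed since and the live contents are still valid.
        for saved in self.snapshots[snap_id:]:
            if index in saved:
                return saved[index]
        return self.blocks[index]


vol = SnapshotVolume(num_blocks=4)
vol.write(0, b"good data")
monday = vol.create_snapshot()
vol.write(0, b"corrupted")           # e.g. a software or human error
tuesday = vol.create_snapshot()
vol.write(0, b"still corrupted")

print(vol.read_snapshot(monday, 0))   # contents before the error
print(vol.read_snapshot(tuesday, 0))  # contents after the error
```

In this sketch the Monday snapshot still returns the data as it was before the corruption, while the live volume and the Tuesday snapshot reflect the later writes. This is the protection against the third scenario that mirroring alone does not give.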

Today, storage virtualization companies have developed the capability to keep multiple point-in-time snapshot copies, while backup companies have the capability to manage catalogs of datasets that reside on tape cartridges.

What will solve the problem of the backup window?
The answer is to add technology to backup software packages that manages catalogs of datasets residing on snapshot copies. In this manner, the backup software will be able to uniformly manage datasets that reside on snapshot copies, on tape cartridges or on both, and intelligently move them from one to the other.
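As a rough illustration of what such a unified catalog might look like, the sketch below tracks datasets regardless of whether they live on a snapshot copy or on a tape cartridge. All names here (Medium, CatalogEntry, BackupCatalog and their methods) are hypothetical and do not correspond to any existing backup product; the actual movement of data to tape is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Medium(Enum):
    SNAPSHOT = "snapshot"
    TAPE = "tape"


@dataclass
class CatalogEntry:
    dataset: str
    taken_at: int    # point in time of the copy, e.g. epoch seconds
    medium: Medium
    location: str    # snapshot identifier or tape cartridge label


class BackupCatalog:
    """One catalog for copies held on snapshots and on tape."""

    def __init__(self) -> None:
        self.entries: List[CatalogEntry] = []

    def record(self, dataset: str, taken_at: int,
               medium: Medium, location: str) -> CatalogEntry:
        entry = CatalogEntry(dataset, taken_at, medium, location)
        self.entries.append(entry)
        return entry

    def latest_before(self, dataset: str, when: int) -> Optional[CatalogEntry]:
        """Most recent copy of a dataset taken at or before `when`,
        whichever medium it resides on."""
        candidates = [e for e in self.entries
                      if e.dataset == dataset and e.taken_at <= when]
        return max(candidates, key=lambda e: e.taken_at, default=None)

    def migrate_to_tape(self, older_than: int, tape_label: str) -> List[CatalogEntry]:
        """Re-home snapshot-resident copies older than the cutoff to a tape.
        Only the catalog is updated here; moving the data is out of scope."""
        moved = []
        for e in self.entries:
            if e.medium is Medium.SNAPSHOT and e.taken_at < older_than:
                e.medium = Medium.TAPE
                e.location = tape_label
                moved.append(e)
        return moved


catalog = BackupCatalog()
catalog.record("orders-db", taken_at=100, medium=Medium.SNAPSHOT, location="snap-1")
catalog.record("orders-db", taken_at=200, medium=Medium.SNAPSHOT, location="snap-2")
moved = catalog.migrate_to_tape(older_than=150, tape_label="TAPE-0001")
restore_point = catalog.latest_before("orders-db", when=120)
print(restore_point)
```

A restore request only asks the catalog for the right point in time; whether the copy is still on a snapshot or has already been moved to tape is a detail the backup software resolves.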

Once enterprise backup software vendors include storage virtualization techniques in their products, it will be possible to create a completely scalable solution for backing up multiple terabytes of data, one that can be used by the fast-growing list of customers that run into the backup window problem.

About the author: Nelson Nahum is a co-founder of StoreAge Networking Technologies and has been its Chief Technology Officer since the company's inception in April 1999.

This was first published in April 2002
