Experts: Make tape technology work for you

SearchStorage.com recently asked several industry tape experts to tell us about some of the common tape issues administrators/managers face every day and how best to solve them. How can you improve on your tape-based backups? How can you avoid the most common backup failures? The experts weigh in on these questions, and share a few best practices. Read insights from the following tape industry veterans:

Brian Avakian, Regional Manager, Data Center Solutions, Imation Corp.
Dave Thomas, Manager Field Engineering, DLTtape Group, Quantum
LTO Program participants -- Bruce Master, Senior Program Manager, Worldwide Tape Products Marketing, IBM; Rick Sellers, Product Marketing Manager, HP; Debra Redmond, Senior Manager, Corporate Communications, Certance; and Tom Yuhas, Director of Data Systems for Sony Electronics.

Editor's note: This interview is part of a larger featured topic, "Improving tape performance," on the ins and outs of tape technology.

What specific steps can people take to improve on the success of their tape-based backups?

Automate with a proven tape management system to help eliminate the chance of human error. Invest in reliable, proven tape technology. LTO technology has data integrity features such as read-after-write verification, advanced error correction code and timing-based servo mechanisms designed to help ensure the data is protected and stored accurately.
What are some of the most common reasons you see for tape backup failures?

The most common reasons for tape backup failures fall into one of three categories: staffing, training and tape management issues.

Staffing issues relate to the changes we have seen in the industry in the past decade or so. The operations staff responsible for tape handling today can be categorized in two ways -- the career operators, and the transient work force. In the past decade the tape operations position has morphed from an entry point into the IT field to more of a manual labor job that competes with the fast food industry for the same staff. In the case of career operators, the issues revolve around the ever-changing infrastructure of the environments and the training involved with staying abreast of those changes. For the transient work force, the issues are also training related, but they extend to finding and retaining staff who can and will implement the standards required for any successful data center.

Training is a continual need, whatever the staffing mix, covering new systems, technologies, methodologies, etc. The problem is that data centers everywhere have reduced headcount to the exact number of people needed to perform whatever tasks their business requires, and training budgets are on the ropes and ready to fall. Time and money really need to be set aside to protect the corporate assets in the form of data.

Tape management, unfortunately, is nonexistent for many of the popular tape and system formats used for mission critical businesses today. Without a formal tape management system and methodology, all tape processing, including the tape backups, is subject to the unknown quality of the media being used. Tape must be managed for adequate quality in use. For tape, temporary errors often lead to permanent errors that result in job failures. The idea is to remove and replace the failing tapes before the permanent error occurs and causes a job failure.
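
As a rough illustration of pulling tapes before temporary errors turn into permanent ones, the sketch below flags cartridges whose temporary-error count has crossed a limit. The `TapeStats` record, its field names and the limit of 50 errors are assumptions made for illustration; they do not come from any vendor specification or tape management product.

```python
from dataclasses import dataclass


@dataclass
class TapeStats:
    """Hypothetical per-cartridge record; field names are illustrative."""
    label: str
    temporary_errors: int   # recoverable errors seen so far
    permanent_errors: int   # unrecoverable errors seen so far


def tapes_to_retire(tapes, temp_error_limit=50):
    """Return the labels of tapes that should be pulled from rotation.

    A tape is flagged if it has already had a permanent error, or if its
    temporary-error count has reached the (assumed) limit.
    """
    return [
        t.label
        for t in tapes
        if t.permanent_errors > 0 or t.temporary_errors >= temp_error_limit
    ]


if __name__ == "__main__":
    inventory = [
        TapeStats("A00001", temporary_errors=3, permanent_errors=0),
        TapeStats("A00002", temporary_errors=72, permanent_errors=0),
        TapeStats("A00003", temporary_errors=10, permanent_errors=1),
    ]
    print(tapes_to_retire(inventory))  # prints ['A00002', 'A00003']
```

In practice the right limit depends on the drive, the media type and how the errors are counted, so it would need to be tuned against the site's own history.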
What specific steps can people take to improve on the success of their tape-based backups?

Observe manufacturers' specifications for drive operation (particularly environmental limits for things like temperature, humidity, and air-borne contaminants) and for media storage/handling.

Reading and understanding the system and ISV software logs is one way to avoid failure since often there are signs of failure long before a backup actually fails. However, this can be time-consuming and the log pages are often difficult to interpret.
What specific steps can people take to improve on the success of their tape-based backups?

Understanding the data is one of the most important and wide-reaching exercises that can be applied to an operation. By knowing the data associated with a business, the business can better be protected from potential disaster. Historically, the backup routines have been associated with "system" backups where mission critical systems are identified and backed up for security or protection. In today's disparate environments businesses are supported by multiple data feeds. To prioritize a system and back up the system where the application for a business resides may not offer the protection desired if the data is being supplied in whole or part from other disparate sources.
What are some of the most common reasons you see for tape backup failures?

Improper storage or mishandling of media is one of the common causes of backup failure. While DLTtape media is designed for durability and reliability, there are some basic steps that can be taken to avoid damage to the cartridge and media. It's important to respect media manufacturers' recommendations in terms of media inspection, handling and storage requirements.

Tapes that have been overused can also cause problems. DLTtape media is designed for heavy duty cycle environments and lasts considerably longer -- it's specified for 5000 insertions -- but it's still important for an IT administrator to monitor media usage.
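
A minimal sketch of that kind of usage monitoring follows. The 5000-insertion figure is the one quoted above; the function names and the 90% warning point are assumptions chosen for illustration.

```python
def remaining_insertions(insertions_so_far, rated_insertions=5000):
    """Insertions left before the cartridge reaches its rated limit."""
    return max(rated_insertions - insertions_so_far, 0)


def needs_replacement(insertions_so_far, rated_insertions=5000,
                      warn_fraction=0.9):
    """True once usage reaches an assumed fraction of the rated limit."""
    return insertions_so_far >= rated_insertions * warn_fraction


if __name__ == "__main__":
    for count in (1200, 4500, 5100):
        print(count, remaining_insertions(count), needs_replacement(count))
```

The insertion counts themselves would have to come from whatever the backup software or library reports for each cartridge.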

System hardware and software can also be an issue: Simple things like exceeding the specified SCSI cable length or having a SCSI terminator missing can cause the backup to fail. So it's important that an administrator makes sure the system is compliant with the standards required for the level of SCSI interface in place.
What are some of the most common reasons you see for tape backup failures?

Human mistakes (e.g. wrong cartridge, wrong data set, wrong policy, etc.) and drive/media failures.
Many users back up to tape over network connections, via storage area networks (SANs) or more traditional LAN connections. What tips can you recommend for reducing network or server overhead during the backup process?

Again, know your data. By knowing what data must be protected, as well as the data that is most valuable to your business, an operation may be able to reduce the amount of traffic or arrange the traffic patterns to reduce or eliminate bottlenecks in the backup routines. Utilizing the LAN or general system network to move the data for backups can also generate contention or bottlenecks for the backup cycle. Utilizing channel attached devices or a dedicated network for backups can alleviate the traffic jams that often occur when a backup routine pushes a large quantity of data across a communications network.
Many users back up to tape over network connections, via storage area networks (SANs) or more traditional LAN connections. What tips can you recommend for reducing network or server overhead during the backup process?

First of all, be realistic about SAN/LAN throughput and plan your hardware accordingly. Fast tape drives can easily saturate network/SCSI connections, so it's important not to cut corners on the bandwidth of your backup system.

Also, be realistic about server throughput -- server throughput is frequently much less than theoretical disk drive throughput, given the overhead associated with fragmented file systems.

Then, if your infrastructure (and budget!) permits it, use dedicated data paths from server(s) to tape drive(s).
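
To make the "be realistic about throughput" advice concrete, here is a back-of-the-envelope sketch: the sustained backup rate can be no faster than the slowest stage in the path from server to network to tape drive. The function names and every figure in the example are illustrative assumptions, not measurements of any product.

```python
def effective_throughput_mb_s(server_mb_s, network_mb_s, drive_mb_s):
    """The slowest stage in the data path sets the sustained rate."""
    rate = min(server_mb_s, network_mb_s, drive_mb_s)
    if rate <= 0:
        raise ValueError("all throughput figures must be positive")
    return rate


def backup_hours(data_gb, server_mb_s, network_mb_s, drive_mb_s):
    """Estimated hours to move data_gb, using 1 GB = 1000 MB."""
    rate = effective_throughput_mb_s(server_mb_s, network_mb_s, drive_mb_s)
    return (data_gb * 1000) / rate / 3600


if __name__ == "__main__":
    # Assumed figures: 500 GB of data, a server that can read 40 MB/s,
    # a shared network delivering 12 MB/s, and a 30 MB/s tape drive.
    print(round(backup_hours(500, 40, 12, 30), 1))  # prints 11.6
```

With these assumed numbers the network is the bottleneck, so a faster tape drive would not shorten the backup window; a dedicated data path would.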
Many users back up to tape over network connections, via storage area networks (SANs) or more traditional LAN connections. What tips can you recommend for reducing network or server overhead during the backup process?

Quiesce the server applications (e.g. Oracle, DB2, Web Services, etc.).
What are some of the simple, best practices that tend to be overlooked when backing up to tape?

Decisions for the tape backup routines must follow two simple rules:

1. Restorability. Can I restore from the backups in a manner acceptable to the business requirements?

2. Identify the true mission critical elements of the business required to produce the service or product that is either profitable or required by law or SLA. Once these are identified, make sound business decisions based on this information.
What are some of the simple, best practices that tend to be overlooked when backing up to tape?

Best practices should include:

1. Know your data. Different types of data need different protection strategies. Know the importance of data and its tolerance for unavailability.
2. Employ a multi-level data protection strategy comprised of mirrors, clones and tape.
3. Do full backups and incrementals to tape.
4. Test your restore capabilities often.
5. Choose proven technologies with an assured future.
What are some of the simple, best practices that tend to be overlooked when backing up to tape?

Proactively manage your media. Monitor its performance and replace it if it does not perform.

Conduct test restores. An IT administrator's biggest fear is often "Will the data be there when I need to restore it?" By conducting test restores an administrator can test the integrity of his backups, as well as measure how long it takes to get the data back.
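
One way to sketch both halves of a test restore, checking integrity and timing the recovery, is shown below. It assumes checksums of the source files were recorded at backup time, and that `run_restore` is a caller-supplied function that drives the real restore; both names are hypothetical and stand in for whatever the backup software actually provides.

```python
import hashlib
import time
from pathlib import Path


def file_digest(path):
    """SHA-256 of one file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_test_restore(expected_digests, restore_dir, run_restore):
    """Run a restore, time it, and list files that are missing or differ.

    run_restore is a hypothetical callable that restores into restore_dir.
    Returns (elapsed_seconds, list_of_bad_file_names).
    """
    start = time.monotonic()
    run_restore(restore_dir)
    elapsed = time.monotonic() - start
    restored = tree_digests(restore_dir)
    bad = sorted(
        name for name, digest in expected_digests.items()
        if restored.get(name) != digest
    )
    return elapsed, bad
```

An empty list of bad files means every expected file came back intact, and the elapsed time gives a first estimate of how long a real recovery of that data set would take.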

Regularly review your backup strategy. As your storage needs grow and you add servers, the system can rapidly outgrow the backup infrastructure.
How can administrators improve on their recovery time from tape? What are some of the "gotchas" to avoid when planning for recovery?

A good start is to ask yourself what is required to be restored and in what time frame? Once these questions are answered the proper technology and the proper coordination of data can be applied, i.e. what data can be stacked on the same physical media and what data needs to be separated to expedite a restore when required.
How can administrators improve on their recovery time from tape? What are some of the "gotchas" to avoid when planning for recovery?

Restoring a system from one full backup and multiple incremental backups can take a long time! Some smaller companies just do full backups, in part to avoid this. Bigger companies with larger data sets may want to look at doing an image backup for quick recovery. Of course, if you can't afford any downtime at all you'll need a mirrored disk system, but remember, disk mirroring is not backup -- virtually all companies that have disk mirrors also use tape, since tape remains the most effective way to get a disconnected offline copy of your data.
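
The reason a full-plus-incrementals restore is slow can be seen in a small sketch: every backup from the most recent full onward has to be located, loaded and read in order. The tuple layout and labels below are assumptions for illustration only.

```python
def restore_chain(backups):
    """Return the backup labels needed to restore to the latest state.

    backups is a chronological list of (label, level) tuples, where level
    is "full" or "incremental". The chain is the most recent full backup
    followed by every incremental taken after it.
    """
    last_full = None
    for index, (_, level) in enumerate(backups):
        if level == "full":
            last_full = index
    if last_full is None:
        raise ValueError("no full backup available to restore from")
    return [label for label, _ in backups[last_full:]]


if __name__ == "__main__":
    week = [
        ("sun-full", "full"),
        ("mon-incr", "incremental"),
        ("tue-incr", "incremental"),
        ("wed-incr", "incremental"),
        ("thu-incr", "incremental"),
    ]
    print(len(restore_chain(week)))  # prints 5
```

A failure on Friday morning in this example means five backups to read; a site doing only full backups would read one.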

Tape transfer rates are now approaching those of hard disk drives, with the result that the largest part of recovery time can come from locating and loading the correct tapes. As a result, if administrators store their backups in a secure offsite location, it's important to have a second copy of the data that stays onsite. The offsite copy will cover them in the event of a disaster that destroys the data center (fire, flooding...), while the onsite copy will allow them to recover quickly from a less catastrophic system failure.
What are some of the most common reasons you see for tape backup failures?

From my experience, most tape backup failures are a result of human errors, software errors or scheduling conflicts.
How can administrators improve on their recovery time from tape? What are some of the "gotchas" to avoid when planning for recovery?

Recovery time improvements are primarily a function of the backup software, combined with the speed of the network and tape backup devices. I recommend that administrators provide ample tape backup storage in an automated tape library rather than shelf storage, which requires manual intervention and can greatly slow the recovery process. Another way to overcome this predicament is to utilize HSM (hierarchical storage management) software that automates the data migration to tape and back again at the file level.
What specific steps can people take to improve on the success of their tape-based backups?

Successful backup begins with proper assessment of the backup needs and system design, usually with a focus on increased automation. Users can be more confident in their backups by implementing automated backup scheduling software. Such software allows for routine scheduling of host backups at full, incremental, and differential levels. The level of backup varies depending on the administrator's scheduling needs.
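
As a sketch of what such a schedule might look like, the function below picks a backup level from the day of the week. The policy itself (a full backup on Sunday, a differential on Wednesday, incrementals on other days) is an assumption chosen for illustration, not a recommendation from the experts; a differential captures changes since the last full backup, while an incremental captures changes since the last backup of any level.

```python
import datetime


def backup_level(day):
    """Pick a backup level for a date under an assumed weekly policy."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday == 6:
        return "full"
    if weekday == 2:
        return "differential"
    return "incremental"


if __name__ == "__main__":
    start = datetime.date(2024, 1, 7)  # a Sunday
    for offset in range(7):
        day = start + datetime.timedelta(days=offset)
        print(day.isoformat(), backup_level(day))
```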
Many users back up to tape over network connections, via storage area networks (SANs) or more traditional LAN connections. What tips can you recommend for reducing network or server overhead during the backup process?

It's important to properly assess your organization's network bandwidth before backing up over it. Increasing end-user data demands on the typical network can slow the performance or stress the system's tape backup equipment.
What are some of the simple, best practices that tend to be overlooked when backing up to tape?

Time and time again, users forget or overlook disk caching or the use of consolidation disk architecture. Through these simple practices, users can increase the immediacy of their backup system while assuring system performance. Disk caching also lets users take full advantage of their tape drive's speed.
How can administrators improve on their recovery time from tape? What are some of the "gotchas" to avoid when planning for recovery?

By using a tape management software system and reliable tape technology, the user may be able to restore a file from a specific cartridge without the extensive searching that typically extends restore times. LTO tape systems employ very high data transfer rates that can help improve restore and backup job performance.

Making a backup onto tape is key. Removing the tape from the system is important to avoid any intentional or unintentional corruption of the data. Removing the cartridge to a fireproof vault is important and sending the data or a second copy of critical data to a secondary offsite location is the ultimate protection in the event the primary site is destroyed.
