Storage Magazine
Big files create big backup issues
by Stephen Foskett
Issue: May 2008

Consistency and timeliness
Some backup systems handle these issues better than others, but all will struggle when faced with a single file system holding millions of files or hundreds of gigabytes of data. Point-in-time consistency across the files in a large file system isn't always required, but it can be critical: if the application keeps running during an eight-hour backup, files copied early may no longer agree with files copied hours later, or with the versions the application is using. The prime solution to the consistency problem is to cheat the constraints of time by creating a snapshot copy of the data to be backed up. Leveraging the technology included in many storage arrays and OSes, a snapshot-based backup can freeze the data set at a point in time and copy it to tape at its leisure. This ensures that the entire set of files to be backed up reflects a single moment, regardless of changes made while the copy runs. But snapshot technology isn't a native component of a backup application, so the particular type used must be supported by the backup system, or custom scripting is required.
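As a rough illustration of why a snapshot yields a consistent copy, the Python sketch below models a volume as a dictionary of named blocks and uses a simple copy-on-write scheme. The `Volume` and `Snapshot` classes, the `demo` function and the block names are all hypothetical; storage arrays and OSes implement snapshots in their own ways, and this toy model only shows the principle.

```python
class Snapshot:
    """Frozen view of a Volume as it looked when the snapshot was taken."""

    def __init__(self, volume):
        self.volume = volume
        self.block_ids = list(volume.blocks)  # blocks that existed at snapshot time
        self.preserved = {}                   # old contents saved on first overwrite

    def read(self, block_id):
        # Prefer the preserved copy; otherwise the live block is still unchanged.
        if block_id in self.preserved:
            return self.preserved[block_id]
        return self.volume.blocks[block_id]


class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.snapshots = []

    def snapshot(self):
        # Nothing is copied up front; the snapshot only starts tracking writes.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, block_id, data):
        # Save the old contents for each snapshot before overwriting a block.
        for snap in self.snapshots:
            if block_id in self.blocks and block_id not in snap.preserved:
                snap.preserved[block_id] = self.blocks[block_id]
        self.blocks[block_id] = data


def demo():
    vol = Volume({"ledger": "balance=100", "journal": "entries=1"})
    snap = vol.snapshot()

    copied = {}
    copied["ledger"] = snap.read("ledger")    # backup copies the first block
    vol.write("ledger", "balance=50")         # application keeps running...
    vol.write("journal", "entries=2")
    copied["journal"] = snap.read("journal")  # ...backup copies the second block
    return copied, dict(vol.blocks)


if __name__ == "__main__":
    copied, live = demo()
    print("backup copy:", copied)
    print("live volume:", live)
```

In this model the backup copy holds the ledger and journal as they were at snapshot time, even though both changed while the copy was in progress. Reading the live blocks in the same order would have paired the old ledger with the new journal.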


The problem with big backups
No matter how you slice it, backing up big file systems is a problem:

  • Backup applications need a few moments to examine each file and determine if it should be backed up or not, and another moment to store a record of each backup in the database. Multiply these moments by a few million, and they add up quickly.


  • Massive files generally can't be backed up in parallel, and traditional backup approaches copy them in their entirety even if just a few bytes have changed.


  • Even if you can wait for the backup to complete, the backup copy might not be consistent with the latest copy of the file.


  • Data is backed up so that it can be restored, but many methods for speeding backups make recovery time unacceptably long.
...
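To make the second point concrete, the Python sketch below compares two versions of a file block by block and reports which blocks differ. It is a contrast to the traditional whole-file copy described above, not a description of any particular backup product; the function names and the tiny block size are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indexes of blocks in new that differ from, or are absent in, old."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = []
    for index, digest in enumerate(new_digests):
        if index >= len(old_digests) or old_digests[index] != digest:
            changed.append(index)
    return changed


if __name__ == "__main__":
    previous = b"AAAABBBBCCCCDDDD"
    current = b"AAAABBxBCCCCDDDD"  # a single byte changed in the second block
    print(changed_blocks(previous, current))
```

A whole-file approach would copy all 16 bytes of `current`; a block-level comparison identifies only the second block (index 1) as needing to be copied.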




A massive number of files
Sheer numbers can overwhelm any backup product (see "The problem with big backups," above). Sean O'Mahoney, manager of client/server information systems at Norton Healthcare in Louisville, KY, saw his Meditech Electronic Medical Record (EMR) file server grow to contain more than 25 million files in 1.3 million directories. "It took almost five hours just for Windows to count the files," explains O'Mahoney, "but we have trimmed the backup time for this half-terabyte LUN to around three hours." The fix was a simple one: Ignore the files and dump raw disk blocks to tape. Although the resulting backup lacks an index of files, the approach works well here because all of those files are part of a single massive application.
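The figures O'Mahoney quotes allow a back-of-the-envelope check on how small per-file costs add up. The Python sketch below treats "almost five hours" and "around three hours" as exactly five and three, and assumes a decimal half-terabyte (0.5 x 10^12 bytes); those roundings and the function names are assumptions, so the results are approximate.

```python
FILES = 25_000_000          # "more than 25 million files"
COUNT_HOURS = 5             # "almost five hours" just to count them
LUN_BYTES = 0.5 * 10**12    # half-terabyte LUN, assuming decimal units
BLOCK_BACKUP_HOURS = 3      # "around three hours" for the raw block dump


def per_file_ms(files, hours):
    """Average time spent per file, in milliseconds."""
    return hours * 3600 * 1000 / files


def scan_hours(files, ms_per_file):
    """Total hours needed if every file costs ms_per_file of overhead."""
    return files * ms_per_file / 1000 / 3600


def throughput_mb_per_s(num_bytes, hours):
    """Average sustained rate in decimal megabytes per second."""
    return num_bytes / (hours * 3600) / 10**6


if __name__ == "__main__":
    print(f"{per_file_ms(FILES, COUNT_HOURS):.2f} ms per file to count")
    print(f"{throughput_mb_per_s(LUN_BYTES, BLOCK_BACKUP_HOURS):.1f} MB/s block dump")
    # Hypothetical: what a 2 ms per-file overhead would cost at this scale.
    print(f"{scan_hours(FILES, 2.0):.1f} hours at 2 ms per file")
```

Under these assumptions, counting works out to about 0.72 ms per file, which still totals five hours across 25 million files, while the block dump averages roughly 46 MB/s with no per-file work at all.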

