High performance computing demands special backup approach


Alan R. Earls
11.12.2008


According to analyst firm IDC, the market for high-performance computing (HPC) servers will reach $15.6 billion by 2012. But for storage administrators, the growth of the HPC server market translates into unique backup challenges created by the special requirements of HPC.

HPC raises two issues when it comes to backup and disaster recovery preparation: the volume of data and the volume of files.

The workload or data volume generated by HPC applications can be very large when dealing with files containing seismic or genomic information. "Those files can be incredibly large," says Gartner analyst David Russell. "Traditional backup approaches may not be adequate or may simply take too much time." For example, he notes, some HPC files can be in the petabyte range.

Some HPC applications also generate exceptionally large numbers of files – "literally millions," according to Russell. "The challenge of how you account for those files or the time it might take to go through an operating system and traverse the file system to see what files have changed is very much a 'heavy lifting' task." Getting that data onto disk, or simply getting it through the server and switch, might take too much time. In short, he says, applying traditional backup tools directly to HPC tasks can be a formula for disaster.

As an alternative to traditional backup tools, Russell says that an HPC administrator could combine technologies such as array-based snapshots and remote replication with data reduction techniques such as deduplication. However, says Russell, not all workloads today benefit from deduplication. For instance, an image that is already in a compressed state usually cannot be reduced further.
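As a rough illustration of why deduplication helps some workloads and not others, the following Python sketch estimates the reduction ratio that fixed-size block deduplication would achieve on a byte string. The 4 KB block size, the function name and the sample data are assumptions made for this example and do not describe any vendor's product. Random bytes stand in for already-compressed data, which tends to contain no repeated blocks.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size for illustration; real products vary


def dedup_ratio(data: bytes, block_size: int = BLOCK_SIZE) -> float:
    """Return total block count divided by unique block count."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 1.0
    # Identical blocks hash to the same digest, so the set keeps one copy each.
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)


# Hypothetical sample: the same 4 KB block repeated 100 times.
repetitive = os.urandom(BLOCK_SIZE) * 100
# Stand-in for compressed data: 100 blocks with no repetition.
random_like = os.urandom(BLOCK_SIZE * 100)

print(dedup_ratio(repetitive))
print(dedup_ratio(random_like))
```

A ratio near 1 means deduplication would store almost as much data as the original, which is the situation Russell describes for images that are already compressed.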

Still, vendors offering compression techniques, such as Ocarina Networks, "have figured out how to reverse-engineer giant files and look for redundancies," says Russell, and there may be ways to further improve the process.

But the number of files in HPC environments is still a major challenge for backup administrators. "If you have a million I/O cycles for a million files, the effort of interrogating all those files, even with a nightly update, will take a long time," says Russell. "I've heard of some HPC applications where it took 30 hours to do a full backup and 28 hours of that was just spent scanning to see what files had changed."
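The scanning cost Russell describes can be seen in a minimal sketch of a timestamp-based incremental scan, written here in Python. The function name and the choice of modification time as the change test are assumptions for illustration; commercial backup tools may use other mechanisms. The point is that every file must be examined even when only a few have changed.

```python
import os


def changed_since(root: str, last_backup: float) -> tuple[list[str], int]:
    """Return (paths modified after last_backup, number of files examined)."""
    changed = []
    examined = 0
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # One metadata lookup per file, changed or not.
                    examined += 1
                    mtime = entry.stat(follow_symlinks=False).st_mtime
                    if mtime > last_backup:
                        changed.append(entry.path)
    return changed, examined
```

Because `examined` grows with the total file count rather than with the number of changed files, a tree holding millions of files pays the full traversal cost on every run.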

In a world with no resource constraints, a storage administrator would have the necessary disk, power and floor space to handle all these backup tasks, says Russell. But what makes it even more difficult is that HPC environments are usually oriented toward scale-out, with lots of servers crunching data. That implies the need for tightly coordinated backup, because, notes Russell, "You don't want different points in time on 25 different servers." Backup can be coordinated, he notes, through "brute force methods" that flush buffers and set a machine checkpoint.

HPC can bear small amounts of downtime
David Hill, an analyst with storage analyst firm The Mesabi Group, points out that for many HPC applications, small amounts of downtime would not be noticeable to the user because many compute-intensive jobs are actually batch jobs. That means the user will not see the results until the job has run to completion. "For a 1-hour-plus job, would five minutes missing in the middle be noticeable?" asks Hill. "The answer is no."

According to Hill, "What these types of jobs really need is checkpoint/restart capabilities, where the state of the memory in the computing environment is written to disk periodically so that it can be restarted."
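A minimal Python sketch of the checkpoint/restart idea follows. It assumes a toy batch job whose entire state fits in a small dictionary, and every name in it (`save_checkpoint`, `load_checkpoint`, `run_job`, the `fail_at` parameter) is hypothetical. Real HPC checkpointing captures far more state, but the pattern of writing state periodically and resuming from the last saved copy is the same.

```python
import os
import pickle
import tempfile


def save_checkpoint(state: dict, path: str) -> None:
    """Write state to a temporary file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes reach disk before the rename
    os.replace(tmp_path, path)


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "total": 0}


def run_job(path: str, steps: int, checkpoint_every: int, fail_at=None) -> int:
    """Sum the integers 0..steps-1, checkpointing every checkpoint_every steps."""
    state = load_checkpoint(path)
    while state["step"] < steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")  # hypothetical crash
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state["total"]
```

If a run fails partway through, calling `run_job` again with the same path resumes from the last checkpoint and repeats only the steps completed since then, which is why a short interruption in a long batch job costs little.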

Depending on the value of timeliness and the value of the data, Hill says that businesses doing HPC might also be willing to consider an active-active failover strategy to a remote disaster recovery site, for both operational recovery from a local problem and disaster recovery. Another option, according to Hill, is performing continuous data protection (CDP) locally, combined with a virtual tape library (VTL) and a standard backup-restore package.

About the author: Alan R. Earls is a Boston-area writer focusing on the intersection of technology and business.


All Rights Reserved, Copyright 2000 - 2010, TechTarget