Big archives need big planning - Storage Technology Magazine

Big archives need big planning

Building out an archive takes careful planning if you want to keep it manageable as it balloons in size.

"One [issue] we've seen get people is retrieval," says Jim Cuff, VP of engineering at Boston-based Iron Mountain, which provides a variety of electronic archiving and vaulting services. Customers who assume they're building a low-access archive "get caught flat-footed" as it grows, and find they can't retrieve data as fast as they can write it.

Another issue is logistics: How do you procure, power, cool and protect that much disk? Iron Mountain has customers who archive 250GB daily, which at first "sounds like a conventional IT problem," Cuff says. But over, say, seven years, that 250GB per day approaches 650TB. What happens if that 250GB per day becomes 500GB? "You're in a different problem domain quite by accident," notes Cuff.
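Cuff's arithmetic is easy to check. The sketch below is illustrative only: the function name is hypothetical, and it assumes constant daily ingest, 365-day years, decimal units (1TB = 1,000GB), and no replication, compression or deletion, none of which the article specifies.

```python
GB_PER_TB = 1000  # assumption: decimal units, as capacity is usually quoted


def archive_size_tb(daily_gb: float, years: float, days_per_year: int = 365) -> float:
    """Total archived data in TB after `years` of constant daily ingest.

    Hypothetical helper for illustration; ignores leap days, replication,
    compression and deletion.
    """
    return daily_gb * days_per_year * years / GB_PER_TB


if __name__ == "__main__":
    # The two ingest rates mentioned in the article, over seven years.
    for daily_gb in (250, 500):
        total_tb = archive_size_tb(daily_gb, 7)
        print(f"{daily_gb} GB/day over 7 years is roughly {total_tb:.2f} TB")
```

Under these assumptions, 250GB per day comes to roughly 639TB after seven years, in line with the "approaches 650TB" figure, and doubling the daily rate pushes the total past 1.2PB.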

Massive array of idle disks (MAID) storage is one technology that may help the archive cause. In a nutshell, a MAID array spins down disks that aren't in use, which reduces wear and tear, lengthens disk life, and saves on power and cooling. An example of a MAID array is Copan Systems' Revolution 200T.

Now the question becomes "How do you manage the data?" Today, most shops simply front the archive with several large file servers and use application code to work around the scalability limitations of traditional file systems (e.g., inode and directory object limitations). But Iron Mountain, for one, is "very interested in the global file system metaphor," says Cuff. Examples include ADIC's StorNext Management Suite and IBM's SAN File System.

But not everyone is sold on distributed file systems. In a joint research project with the Cornell Theory Center, Unisys ruled out the clustered file system approach "because we saw that it couldn't scale past 30TB or so," says Dr. Michael Salsburg, director for systems and technology at Unisys. "Data clusters look good and look inexpensive," he adds, but as the amount of storage grows, "you get totally bogged down with the communication."

This was first published in February 2005