Preserving the world's greatest books is no small task for the libraries entrusted with their care. Storing and sharing those same precious volumes digitally is just as daunting. The Folger Shakespeare Library can be assured that the digital version of Shakespeare's "First Folio" will not be a comedy of errors. Thanks to Oakland, Calif.-based Octavo, "The Comedie of Errors" is not only stored accurately but is also available to the public for viewing.
Octavo's mission is to assist libraries and private collectors in digitizing, storing and publishing the volumes in their collections. To be successful, Octavo must use technologies that can accurately store digital replicas of these documents for a long time. In this case, long means a millennium.
"The Library of Congress, for example, wants to be sure that data is kept correctly for hundreds of years," according to Hans Hansen, senior technologist for Octavo.
The challenge in accomplishing this goal is that "there is not a digital media that is reliable in the long term," said Hansen.
Four-year-old Octavo started out storing its library partners' data on CD-R media. It used "particularly good" CD-Rs that held up for several years, Hansen said. Even so, the data would degrade while in storage and needed to be refreshed on a regular basis. So the company started with three copies of every CD and made a fresh set every year. "We had shelves in our offices holding tons of CDs," said Hansen.
Making duplicate copies and refreshing them all regularly was labor intensive and relatively expensive.
With this system, real-time data access was not possible. If a librarian needed access to a digital volume, she would have to contact Octavo, which would pull a CD containing the requested volume and send it out.
These problems were bad enough when Octavo's storage archives held 1 terabyte of data. The company's plan to offer a new digital camera lab system to its library clients promised to significantly increase the archive's load. "In five years we could be looking at 1,000TB of stored images," said Hansen.
Hansen uses the example of the Gutenberg Bible to illustrate Octavo's future storage requirements. Every image taken by the new digital camera labs is at least 380MB in size. The Gutenberg Bible alone will require about 650 images. By Hansen's estimate, that's almost half a terabyte for one book.
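For readers who want to check the arithmetic, here is a minimal sketch of the per-book storage estimate. It uses only the two figures quoted above (650 images, at least 380MB each) and assumes decimal units (1TB = 1,000,000MB); the function and variable names are illustrative, not anything Octavo uses. At the 380MB floor the total comes to roughly a quarter of a terabyte, so the half-terabyte figure presumably reflects images that run well above the minimum.

```python
# Back-of-the-envelope storage estimate for one digitized book.
# Figures come from the article; decimal units are an assumption.
MB_PER_TB = 1_000_000  # assumed: 1TB = 1,000,000MB (decimal)

IMAGES_PER_BOOK = 650      # Gutenberg Bible, per the article
MIN_MB_PER_IMAGE = 380     # stated minimum image size


def book_size_mb(images: int, mb_per_image: float) -> float:
    """Total size in MB for a book captured as `images` files."""
    return images * mb_per_image


def mb_per_image_for_target(images: int, target_tb: float) -> float:
    """Average image size (MB) needed for a book to reach `target_tb`."""
    return target_tb * MB_PER_TB / images


lower_bound_mb = book_size_mb(IMAGES_PER_BOOK, MIN_MB_PER_IMAGE)
lower_bound_tb = lower_bound_mb / MB_PER_TB
avg_for_half_tb = mb_per_image_for_target(IMAGES_PER_BOOK, 0.5)

print(f"Lower bound: {lower_bound_mb:,.0f}MB ({lower_bound_tb:.3f}TB)")
print(f"Average image size for 0.5TB: {avg_for_half_tb:.0f}MB")
```

Under these assumptions, a half-terabyte book would need images averaging about 769MB, roughly double the stated minimum.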
With this storage growth projection in hand, Octavo went searching for a storage vendor that could offer secure, scalable and reliable data management systems. "Frankly, there wasn't a long list of vendors with the features we needed," said Hansen.
San Francisco, Calif.-based Scale Eight, however, had a long list of the right features, said Hansen. Its technology promised to be flexible, cost effective, secure, accurate and scalable.
Scale Eight's Global Storage Ports can be mounted directly on Octavo's camera lab workstations. This gives operators instant access to images as soon as they are captured. This is especially useful in library projects with many people involved, such as librarians managing the project, conservators guarding the object itself, and scholars examining the details and annotating them with bibliographic information. Everyone can instantly make sure that the document is captured accurately, Hansen said.
Scale Eight stores data on a live hard drive system rather than offline. Hansen believes that is the best way to store data accurately. "The data stored by Scale Eight is constantly being refreshed," he said. "That the data resided online, all the time, is very critical to us." Another plus is that Scale Eight monitors its systems very closely.
Octavo's data is mostly uploaded to Scale Eight's Santa Clara location, but it is also mirrored to Scale Eight's Virginia location. Having the data in completely separate geographical locations provides assurance that data will not be lost if a disaster occurs at any one location. "The way the data is distributed across their servers is very valuable to us," said Hansen.
The scalability offered by the Scale Eight system was crucial to Octavo. "Scale Eight has essentially unlimited storage available to us," said Hansen. "We won't have to lift a finger to get the added storage we'll need in the coming years."
Scale Eight's ability to support heterogeneous systems also fit the bill for Octavo. Like many graphics-oriented companies, Octavo uses Macintosh-based workstations. The camera labs' custom Online Capture Systems application runs on Linux-based Web servers and the company's image server runs Microsoft Windows NT.
Scale Eight's innovative security technology offered Octavo and its institutional partners "peace of mind," said Hansen. Each file stored by Octavo is automatically given an authenticated Scale Eight URL, or 8RL. The 8RLs give the public access to the thirty-plus volumes on Octavo's Web site, but that access is one-way.
"The public can retrieve data, but they can't get into the system and muck with it," said Hansen. "It's very elegantly done."
Doing their homework before going with the Scale Eight system, Octavo's technologists researched "at length" what it would take to create similar capabilities themselves, said Hansen. The scales were balanced heavily in Scale Eight's favor. "$100,000 in Scale Eight services would cost us a $1 million to create," said Hansen. "Naturally, we decided not to do it ourselves."
For more information about Octavo, visit its Web site.
For additional information about Scale Eight, visit its Web site.