The San Jose company will make its data reduction appliance and software generally available next Monday, although it already claims to have customers using its products. The data reduction system includes three components: the Ocarina Optimizer appliance, which performs the compression; the Ocarina Reader software, which decompresses the files for viewing; and the Ocarina Manager user interface.
The Ocarina Reader is a software agent that can be installed on workstations or servers to decompress files that the Optimizer has compressed so they can be viewed. The Reader is needed to restore optimized files, and any Reader in any location can be used to read any file in Ocarina format.
The Optimizer and the Reader can be used with CIFS and NFS file systems on both monolithic and clustered NAS systems and support about 100 proprietary file formats, including Microsoft Office, video and image files.
Tackling precompressed and multimedia files
Once the Optimizer appliance has copied a file out of the primary data store, it pulls the file apart to extract the component objects. For example, a PowerPoint file could be broken down into text and .jpg or .png graphics. From there, each storage object is compressed, and redundant objects are consolidated across files. Finally, the Ocarina box returns the optimized file to the primary storage device. The data is protected against corruption with a set of checksum algorithms inserted into the header of each file.
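The workflow described above (extract objects, compress each one, consolidate duplicates across files, verify with checksums) can be illustrated with a minimal Python sketch. This is not Ocarina's implementation: the function names `optimize` and `restore`, the use of zlib and SHA-256, and the representation of a file as a list of already-extracted byte objects are all assumptions made for illustration, and Ocarina's own algorithms are proprietary.

```python
import hashlib
import zlib


def optimize(files, store):
    """Compress and deduplicate objects across files (hypothetical sketch).

    files: dict mapping a filename to a list of its component objects (bytes),
           assumed to have been extracted from the container format already.
    store: shared dict mapping a SHA-256 digest to the compressed object.
    Returns a manifest per file: the ordered list of digests for its objects.
    """
    manifests = {}
    for name, objects in files.items():
        digests = []
        for obj in objects:
            digest = hashlib.sha256(obj).hexdigest()
            # An object already seen in any file is stored only once.
            if digest not in store:
                store[digest] = zlib.compress(obj, 9)
            digests.append(digest)
        manifests[name] = digests
    return manifests


def restore(manifest, store):
    """Decompress a file's objects and verify each against its checksum."""
    objects = []
    for digest in manifest:
        obj = zlib.decompress(store[digest])
        if hashlib.sha256(obj).hexdigest() != digest:
            raise ValueError("checksum mismatch: object is corrupt")
        objects.append(obj)
    return objects


if __name__ == "__main__":
    logo = b"shared-logo-bytes" * 100  # same graphic embedded in two decks
    decks = {
        "q1.ppt": [b"Q1 results text" * 50, logo],
        "q2.ppt": [b"Q2 results text" * 50, logo],
    }
    shared_store = {}
    manifests = optimize(decks, shared_store)
    # Four objects went in, but the shared logo is stored only once.
    print(len(shared_store), "unique objects stored")
    assert restore(manifests["q1.ppt"], shared_store) == decks["q1.ppt"]
```

In this sketch the digest doubles as both the deduplication key and the integrity check, which is a simplification; the article says only that checksums are placed in each file's header.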
Ocarina claims a 10 to 1 data reduction ratio, an improvement on standard compression technologies that usually hit a ratio of about 2 to 1. The company achieves that ratio by pulling apart standard file formats, many of which are natively compressed, then applying proprietary algorithms to further compress the objects within those files. These algorithms also make it possible to create a three-dimensional cube of numeric values to represent a photo or video image.
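To put the two ratios in perspective, the short sketch below converts a reduction ratio into stored size and space saved. The 100 TB figure is a hypothetical input chosen for easy arithmetic, not a number from Ocarina or its customers, and the helper names are illustrative only.

```python
def stored_tb(raw_tb, ratio):
    """Size on disk after reduction at ratio:1."""
    return raw_tb / ratio


def saved_fraction(ratio):
    """Fraction of raw capacity freed at ratio:1."""
    return 1 - 1 / ratio


raw_tb = 100.0  # hypothetical data set size
for label, ratio in (("standard compression", 2), ("Ocarina's claim", 10)):
    print(f"{label}: {stored_tb(raw_tb, ratio):.0f} TB stored, "
          f"{saved_fraction(ratio):.0%} saved")
```

At 2 to 1 the hypothetical data set occupies half its original space; at the claimed 10 to 1 it occupies a tenth, so the same disk would hold five times as much data as under standard compression.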
"So if you take some photos on a day at the beach, our algorithms will be able to look for similar boundaries, such as light levels, that it's seen before," said Carter George, vice president of products. "It's like computer vision."
Photo-sharing site expects to save 200 TB
So far, Ocarina has named one user in the Web 2.0 multimedia market and claims more big names in that space are testing the product. Graham Hobson, chief technology officer of Photobox Ltd., a UK-based photo sharing and printing site, said his company has been testing the Ocarina products since their prealpha stage two years ago. He intends to put them into production beginning next week.
Photobox was founded in 2000, and for the next six years, Hobson said, the cost of data storage fell each year at a rate that kept up with the company's growth in storage capacity. But about two years ago, the company's growth in capacity began outstripping the dropping cost of storage. "We're dealing with bigger files now," Hobson said. The monthly cost of rack space in the company's colocation data centers in Europe has increased around fourfold because of increasing energy prices.
Photobox currently has about 800 TB of capacity on clustered NAS systems from Isilon, with about 600 TB of that used. "Without data reduction, we'll exhaust that capacity in the next three months," according to Hobson. With Ocarina, the company is hoping to make that capacity last through September, a savings of about 200 TB.
Primary data reduction still bleeding-edge
Ocarina is not the first vendor to market with a data reduction product for primary storage. Storwize Inc. also compresses data on primary NAS systems. However, Storwize sits in the data path and passes through files it can't optimize. The Storwize appliance is also needed to recover data, and a separate Storwize appliance is needed to receive and restore data at secondary sites.
Ocarina leverages this difference between the products to market its solution as less risky than Storwize's "bump in the wire" approach, something Hobson said gave him more confidence in the Ocarina product.
But Ocarina's data reduction philosophy isn't for everyone. In addition to worries about data corruption that secondary storage data deduplication makers have also faced, Ocarina's "computer vision" means it is literally reading files, which security-conscious storage managers might consider too risky. So for now, Ocarina's customer focus will be websites that store many photos and videos.