Meet Fotango, a wholly owned subsidiary of Canon Corp., and arguably one of the first users of IBM's SAN File System outside traditional high-performance computing environments such as universities and research labs.
Fotango is using SAN FS as the back-end file system for the free Canon Internet Gateway (CIG) service, where users can upload, store, share and buy prints of the digital photos they take with their Canon cameras. In the next five years, Canon anticipates one million European users, each allotted up to 100MB of space. That translates to 100TB of active data--if Canon does not increase users' space allotment.
When Stuart Fox, the infrastructure consultant on the project, started thinking about how to build this site, he didn't think of a clustered SAN file system. At the time, "we were looking for things that we weren't sure existed." Initial estimates of how much capacity they would need were dramatically lower--30TB or so--which "everyone can handle." But "every week, that number would be increased," Fox says. Today, Fox estimates that 1 petabyte is all the online capacity Canon will ever need, both for CIG applications and for any other applications they dream up.
They did know what they wanted: data to be online--not off on an optical disk somewhere. They knew "we could never, ever lose data." And they knew they wanted a single file system so that users could share photos with friends and family without having to copy the files. Instead, Fotango could enable sharing with Unix hard links--additional directory entries that point to a single file--which don't work across multiple file systems.
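To illustrate the sharing approach described above, here is a minimal Python sketch of how a hard link gives a second name to one stored file. The file names and the `share_photo` helper are hypothetical; nothing here reflects Fotango's actual code.

```python
import os
import tempfile


def share_photo(original: str, shared_path: str) -> None:
    """Share a file by creating a second directory entry (hard link) for it.

    No file data is copied; both names refer to the same stored file.
    """
    os.link(original, shared_path)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical paths standing in for an owner's and a friend's album.
    owner = os.path.join(root, "owner_photo.jpg")
    friend = os.path.join(root, "friend_photo.jpg")

    with open(owner, "wb") as f:
        f.write(b"placeholder image bytes")

    share_photo(owner, friend)

    owner_stat = os.stat(owner)
    friend_stat = os.stat(friend)

    same_file = os.path.samefile(owner, friend)          # both names, one file
    same_inode = owner_stat.st_ino == friend_stat.st_ino  # identical inode
    link_count = owner_stat.st_nlink                      # names pointing at it
```

If the two paths were on different file systems, `os.link` would raise `OSError`, which is why a single file system matters for this design.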
In a test to determine how much space they save with hard links, Fotango evaluated 500GB of user data. Initially, Fox believed that approximately 35% of the data was shared, which if expanded, would turn 500GB into 675GB. Instead, 500GB turned into 1.35TB--an increase of 170%! Clearly, "sharing is one of the most popular features on the site," Fox says.
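The arithmetic behind those figures can be checked with a short Python sketch. It assumes decimal units (1.35TB taken as 1,350GB) and uses integer math to avoid rounding noise; the variable names are illustrative only.

```python
# Figures quoted in the article, in GB (assuming 1TB = 1,000GB).
stored_gb = 500            # data actually on disk, with hard links
expected_shared_pct = 35   # Fox's initial estimate of shared data
measured_gb = 1350         # size once every hard link is expanded to a copy

# Size Fox expected after expansion: 500GB plus 35%.
expected_gb = stored_gb * (100 + expected_shared_pct) // 100

# Actual growth when links are expanded, as a percentage of stored data.
increase_pct = (measured_gb - stored_gb) * 100 // stored_gb

# Capacity that hard links avoid consuming.
saved_gb = measured_gb - stored_gb

print(expected_gb, increase_pct, saved_gb)
```

Put another way, without hard links the same user data would occupy 850GB more than it does.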
Eventually, Fotango picked IBM as its infrastructure provider, and initially ran GPFS on AIX. As of this month, SAN FS and SAN Volume Controller (SVC) will sit behind 70 or so xSeries application servers. The data resides on a FAStT700 and FAStT200s, but eventually, Fotango will implement storage tiers--a hot, warm and cold pool--where ATA-based storage will house infrequently accessed or old data. "The ability to do policies really sold us on SAN FS," Fox says.
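As a rough illustration of what an age-based tiering policy does, the Python sketch below assigns a file to a hot, warm or cold pool by time since last access. It is not SAN FS policy syntax, and the 30-day and 365-day thresholds are assumptions made up for the example, not figures from Fotango or IBM.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; real cutoffs would be a site-specific decision.
HOT_MAX_AGE = timedelta(days=30)
WARM_MAX_AGE = timedelta(days=365)


def choose_pool(last_access: datetime, now: datetime) -> str:
    """Return the storage pool for a file based on how recently it was used."""
    age = now - last_access
    if age <= HOT_MAX_AGE:
        return "hot"    # recently used data stays on the fastest disks
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"       # old data, e.g. destined for ATA-based storage


now = datetime(2004, 1, 1)
recent = choose_pool(now - timedelta(days=2), now)
older = choose_pool(now - timedelta(days=90), now)
oldest = choose_pool(now - timedelta(days=800), now)
```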
SAN FS has also proved to be stable and very fast, Fox says. "I have a dedicated SAN FS engineer whose only brief from me was 'Break it,' and he couldn't." Furthermore, Fotango has seen 550,000 IOPS on its two-node cluster running SAN FS, as compared to the 125,000 IOPS IBM promised it.