This article can also be found in the Premium Editorial Download "Storage magazine: Expanding SANs: How to scale today's storage networks."
Targeting business users
Thus far, business applications, such as databases, have not mixed very successfully with shared file systems. That's because in business--where files are not nearly as large as in the aforementioned technical and creative markets--the latency introduced by the distributed lock manager overrides the performance of the SAN and the convenience of the global namespace.
Truskowski admits that the SAN FS metadata server introduces latency--"that's a fact of life," he says. But according to former IBM chief scientist and SAN FS designer Randal Burns, now an assistant professor at Johns Hopkins University, the way IBM architected its out-of-band clustered metadata server minimizes the latency issues that have plagued other clustered file system implementations.
Another problem of clustered file systems that IBM claims to have licked is the limit on the number of clients the file system can support. Professor Burns says that back in 1997, when he and other Storage Tank designers stood at the whiteboard, they originally envisioned a system that would support 10,000 clients, 1,000 servers and one petabyte (PB) of data. Contrast that with SGI's CXFS, which today limits the number of clients participating in its file system environment to 64.
Certainly, CERN expects to connect more than a handful of nodes with SAN FS. The particle physics research facility is exploring SAN FS to manage the data generated by its Large Hadron Collider (LHC), which is scheduled to be turned on in 2007. The collider, which will run for at least a decade, is expected to generate a jaw-dropping 10PB of data per year, largely in the form of image files approximately 1MB in size. These files document CERN researchers' attempts to recreate, on a small scale, what the universe may have looked like when it was first created. They will reside around the world on machines in the CERN grid, where researchers will be able to sift through them "looking for odd events," says Francois Grey, CERN openlab development officer. "People talk about finding a needle in a haystack," he says. "This will be more like finding a needle in 10 million haystacks."
According to Grey, CERN openlab is always researching technologies that might be useful in its lab, including in the area of data management, and deems IBM's SAN FS "the most promising" technology it has seen so far.
It stands to reason that if SAN FS proves good enough for CERN to manage hundreds of petabytes of files, it should be more than adequate at managing the files of your average enterprise, most of which today turn to Network Appliance Inc. (NetApp) and EMC for large-scale file storage needs.
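For a sense of scale, the CERN figures quoted above can be turned into file counts with some quick arithmetic. The sketch below assumes decimal units (1PB = 10^15 bytes, 1MB = 10^6 bytes), treats every file as exactly 1MB and takes "at least a decade" as 10 years; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the CERN figures quoted in the article.
# Assumptions: decimal units, every file exactly 1MB, a 10-year run.
PB = 10**15
MB = 10**6

data_per_year = 10 * PB   # "10PB of data per year"
file_size = 1 * MB        # "approximately 1MB in size"
years = 10                # "at least a decade"

files_per_year = data_per_year // file_size
total_data_pb = data_per_year * years // PB
total_files = files_per_year * years

print(f"files per year: {files_per_year:,}")
print(f"data after {years} years: {total_data_pb} PB")
print(f"files after {years} years: {total_files:,}")
```

Under those assumptions, the collider would produce roughly 10 billion files a year and about 100PB over a decade, which is where the "hundreds of petabytes" order of magnitude comes from.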
Adding cost and complexity
But some contend that SAN FS, like distributed file systems in general, adds unnecessary cost and complexity to the business of storing and serving files. Whereas NAS systems are renowned for their reliability and ease of installation, distributed file systems are perceived as hard to implement, hard to keep running and, for all but the biggest companies, hard to pay for.
Randy Kerns, a senior analyst at the Evaluator Group, says one company he works with considered deploying a SAN file system, but dropped the project. The company had 80 or so servers that it was hoping to hook into the system. It calculated that after scheduling the necessary downtime to install the client-side code, installation alone would take about a year.
To be fair, one of SAN FS' central design points is its ability to coexist on an application server with another file system. IBM also provides data migration tools that can move data out of one file system and into SAN FS on a per-application basis, says Truskowski. "It's not an all-or-nothing proposition--you don't have to commit to do everything at once," he says, adding: "that wouldn't work for our enterprise customers."
But once the clustered file system is installed, are your troubles over? Probably not, at least in SAN FS' initial release. SAN file systems have the reputation, deserved or otherwise, of being finicky. "I won't kid you, this stuff is hard to do," says Paul Rutherford, vice president of technology at ADIC. "Especially when you start to add heterogeneous clients."
At SGI--which recently added support for clients other than SGI IRIX--Gabriel Broner, senior vice president and general manager of the storage and software group, admits that "we had some rough times" ironing out compatibility issues, although he is now confident of CXFS' stability in a heterogeneous environment.
But Steve Kenniston, a technology analyst from the Enterprise Storage Group, argues that IBM is fully aware of these issues, and as "a trusted brand," is "at more of a risk if it doesn't work than if it were just a startup." In other words, IBM has a big incentive to ensure that everything works as advertised. And overall, Kenniston says, he was "pretty impressed" with the way IBM "took a look at all the major issues" including file-locking, multipathing, cache and timeouts.
And no one can accuse IBM of biting off more than it can chew in its first release. With release 1.0 limiting its support to IBM and Windows environments, initial SAN FS installations will most likely be sold largely as bundled solutions, probably as part of a push to get customers to consolidate on Shark, Kenniston suggests. And make no mistake--this stuff won't come cheap, with many installations coming in at "a million dollars and north of that," says Kenniston.
Assume SAN FS installs without a hitch and runs perfectly. The question then becomes: Should you consider SAN FS, especially if you're not in the "lunatic market," as ADIC's Rutherford affectionately terms the technical computing markets that have traditionally deployed distributed file systems?
The one problem that SAN FS--and all distributed file systems--really do fix is the eternal "I love my first filer, I hate my tenth" problem. That cliche, usually applied to Network Appliance filers, arose because contemporary 32-bit file systems (and the NAS systems that are built on top of them) tend to top out around 2TB, explains IBM's Truskowski. And while that may have been a lot of capacity five years ago, these days, it's a drop in the bucket.
Sixty-four-bit file system technology, combined with a global namespace, can free users from these limitations. SGI's CXFS, for example, has a theoretical capacity limit of 18 exabytes--that's 18 million terabytes--although realistically, CXFS customers top out at capacities of hundreds of terabytes, says SGI's Broner.
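The 2TB and 18-exabyte figures both fall out of simple address-width arithmetic. The sketch below assumes, as one common explanation that the article does not itself state, that the 32-bit ceiling comes from 32-bit block numbers with 512-byte blocks, and that the 64-bit figure corresponds to a 64-bit byte offset. The names are illustrative.

```python
# Where the 2TB and 18-exabyte figures plausibly come from.
# Assumption (not stated in the article): the 32-bit limit reflects
# 32-bit block numbers with 512-byte blocks; the 64-bit limit reflects
# a 64-bit byte offset.
BLOCK_SIZE = 512

TIB = 2**40   # binary terabyte (tebibyte)
TB = 10**12   # decimal terabyte
EB = 10**18   # decimal exabyte

limit_32 = 2**32 * BLOCK_SIZE   # bytes reachable with 32-bit block numbers
limit_64 = 2**64                # bytes reachable with a 64-bit byte offset

print(f"32-bit limit: {limit_32 // TIB} TiB")
print(f"64-bit limit: about {limit_64 / EB:.1f} EB")
print(f"64-bit limit: about {limit_64 // TB:,} TB")
```

On those assumptions the 32-bit limit is exactly 2 binary terabytes, and 2^64 bytes is about 18.4 decimal exabytes, or a little over 18 million terabytes, in line with the figures quoted for CXFS.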
This was first published in November 2003