Truskowski admits that SAN FS' metadata server introduces latency--"that's a fact of life," he says. But according to former IBM chief scientist and SAN FS designer Randal Burns, now assistant professor at Johns Hopkins University, the way IBM architected its out-of-band clustered metadata server minimizes the latency issues that have plagued other clustered file system implementations.
Another problem of clustered file systems that IBM claims to have licked is the number of clients SAN FS can support. Professor Burns says that back in 1997, when he and other Storage Tank designers stood at the whiteboard, they originally envisioned a system that would support 10,000 clients, 1,000 servers and one petabyte (PB) of data. Contrast that with SGI's CXFS, which today limits the number of clients participating in its file system environment to 64.
Certainly, CERN expects to connect more than a handful of nodes with SAN FS. The particle physics research facility is exploring SAN FS to manage the data generated by its Large Hadron Collider (LHC), which is scheduled to be turned on in 2007. The collider, which will run for at least a decade, is expected to generate a jaw-dropping 10PB of data per year, largely in the form of image files approximately 1MB in size. These files document CERN researchers' attempts to recreate, on a small scale, what the universe may have looked like when it was first created. They will reside around the world on machines in the CERN grid, where researchers will be able to sift through them "looking for odd events," says Francois Grey, CERN openlab development officer. "People talk about finding a needle in a haystack," he says. "This will be more like finding a needle in 10 million haystacks."
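To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (10PB per year, files of about 1MB, a run of at least a decade) and assumes decimal units and a uniform file size, neither of which CERN has specified.

```python
# Rough estimate of LHC file counts from the figures quoted in the article.
# Assumptions: decimal units (1PB = 10**15 bytes, 1MB = 10**6 bytes) and
# every file being exactly 1MB; real file sizes will vary.
PB = 10**15
MB = 10**6

data_per_year = 10 * PB   # "10PB of data per year"
avg_file_size = 1 * MB    # "approximately 1MB in size"
years = 10                # "at least a decade"

files_per_year = data_per_year // avg_file_size
total_files = files_per_year * years
total_pb = data_per_year * years // PB

print(f"{files_per_year:,} files per year")
print(f"{total_files:,} files and {total_pb}PB over {years} years")
```

Under those assumptions, the file system would have to track on the order of 10 billion new files every year.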
According to Grey, CERN openlab is always researching technologies that might be useful in its lab, including in the area of data management, and deems IBM's SAN FS "the most promising" technology it has seen so far.
It stands to reason that if SAN FS proves good enough for CERN to manage hundreds of petabytes of files, it should be more than adequate for managing the files of your average enterprise, most of which today turn to Network Appliance Inc. (NetApp) and EMC for large-scale file storage needs.
Adding cost and complexity
But there are people out there who contend that SAN FS--and distributed file systems in general--adds unnecessary cost and complexity to the business of storing and serving files. Whereas NAS systems are renowned for their reliability and ease of installation, distributed file systems are perceived as hard to implement, hard to keep running and--for all but the biggest companies--hard to pay for.
Randy Kerns, a senior analyst at the Evaluator Group, says one company he works with considered deploying a SAN file system, but dropped the project. The company had 80 or so servers that it was hoping to hook into the system. It calculated that after scheduling the necessary downtime to install the client-side code, installation alone would take about a year.
To be fair, one of SAN FS' central design points is its ability to coexist on an application server with another file system. IBM also provides data migration tools that can move data out of one file system and into SAN FS, on a per-application basis, says Truskowski. "It's not an all-or-nothing proposition--you don't have to commit to do everything at once," he says, adding: "that wouldn't work for our enterprise customers."
But once the clustered file system is installed, are your troubles over? Probably not, at least in SAN FS' initial release. SAN file systems have the reputation, deserved or otherwise, of being finicky. "I won't kid you, this stuff is hard to do," says Paul Rutherford, vice president of technology at ADIC. "Especially when you start to add heterogeneous clients."
At SGI--which recently added support for clients other than SGI's own IRIX--Gabriel Broner, senior vice president and general manager of the storage and software group, admits that "we had some rough times" ironing out compatibility issues, although he is now confident of CXFS' stability in a heterogeneous environment.
But Steve Kenniston, a technology analyst from the Enterprise Storage Group, argues that IBM is fully aware of these issues, and as "a trusted brand," is "at more of a risk if it doesn't work than if it were just a startup." In other words, IBM has a big incentive to ensure that everything works as advertised. And overall, Kenniston says, he was "pretty impressed" with the way IBM "took a look at all the major issues" including file-locking, multipathing, cache and timeouts.
And no one can accuse IBM of biting off more than it can chew in its first release. With release 1.0 limiting its support to IBM and Windows environments, initial SAN FS solutions will most likely be sold largely as bundled solutions, probably as part of a push to get customers to consolidate on Shark, Kenniston suggests. And make no mistake--this stuff won't come cheap, with many installations coming in at "a million dollars and north of that," says Kenniston.
Assume SAN FS installs without a hitch and runs perfectly. The question then becomes: Should you consider SAN FS, especially if you're not in the "lunatic market," as ADIC's Rutherford affectionately terms the technical computing markets that have traditionally deployed distributed file systems?
The one problem that SAN FS--and all distributed file systems--really does fix is the eternal "I love my first filer, I hate my tenth" problem. That cliche, usually applied to Network Appliance filers, arose because contemporary 32-bit file systems (and the NAS systems that are built on top of them) tend to top out around 2TB, explains IBM's Truskowski. And while that may have been a lot of capacity five years ago, these days, it's a drop in the bucket.
Sixty-four-bit file system technology, combined with a global namespace, can free users from these limitations. SGI's CXFS, for example, has a theoretical capacity limit of 18 exabytes--that's 18 million terabytes--although realistically, CXFS customers top out at capacities of hundreds of terabytes, says SGI's Broner.
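The 2TB and 18-exabyte figures both fall out of simple address-width arithmetic, as the short Python sketch below shows. The article gives only the limits themselves. The 512-byte block size and the idea that a 32-bit file system counts blocks while the 64-bit limit counts bytes are assumptions made here for illustration, not details supplied by IBM or SGI.

```python
# Where the capacity ceilings plausibly come from.
# Assumptions (not from the article): a 32-bit file system addresses
# 512-byte blocks with a 32-bit block number; the 64-bit limit is
# expressed as a 64-bit byte offset.
BLOCK_SIZE = 512

limit_32bit = 2**32 * BLOCK_SIZE   # bytes addressable with 32-bit block numbers
limit_64bit = 2**64                # bytes addressable with a 64-bit offset

print(f"32-bit limit: {limit_32bit / 2**40:.0f} TiB")
print(f"64-bit limit: {limit_64bit / 10**18:.1f} EB "
      f"(about {limit_64bit // 10**12:,} TB)")
```

With different block sizes the 32-bit ceiling moves, which is one reason the article says such systems "tend to top out around 2TB" rather than citing a hard limit.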
Another way to manage a lot of files
With the rights to license over 70 million images, and more coming every day, Corbis in Seattle, WA, is the poster child of users with large shared file requirements.
And while the company did consider clustered file system technology, reports Alex Taylor, manager of systems engineering, they ultimately decided against it, because while "it would have helped with the administration, it wouldn't have helped with the engineering." He adds: "There's a lot of thinking that goes into a SAN."
After grappling with about 25TB of SAN storage, this winter Corbis decided to change course, and purchased 10TB of storage from NAS startup Isilon Systems, onto which it gradually moved its image data. The group subsequently purchased another 13TB, for 23TB total.
For administrators, the change has been dramatic. Whereas it used to take between three and four weeks to provision new SAN storage, provisioning Isilon storage takes only two days. That's from the moment Taylor places a call to Isilon to the moment the storage is available.
As a result, Corbis has been able to reclaim its SAN "for what it was designed to do: SQL databases."
Is NAS a better alternative?
In the world of storage as elsewhere, there are many different ways to skin a cat. And if it's file sharing that's your trouble, there are a lot of people out there who will argue that NAS is still your best bet.
That may be especially true these days, with several new technologies on tap that address the limitations of traditional NAS systems. Startups such as NuView Inc., Houston, TX, and Z-Force, Santa Clara, CA, for example, both offer software that can aggregate a NAS environment into one coherent view. And then there are NAS startups such as Isilon Systems Inc., Seattle, WA, and Spinnaker Networks, Pittsburgh, PA, which run their proprietary NAS boxes on top of 64-bit clustered file systems. In Spinnaker's case, users can cluster 512 of its NAS boxes together in a single 11PB file system.
IBM's Truskowski, however, is dismissive of NAS for demanding environments. "The performance is lousy," he says. Assuming Gigabit Ethernet as your connection, you're moving data at half the speed of today's 2Gb/s FC networks. "There's definitely a place for NAS," he says, "but trying to solve these sorts of problems with it is like sending out a corporal to do a general's job."
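A quick Python sketch shows what "half the speed" means in practice. It compares nominal link rates only: 1Gb/s for Gigabit Ethernet and 2Gb/s for Fibre Channel, as quoted above. The 1TB payload is an arbitrary example, and the calculation ignores encoding and protocol overhead, so real-world transfer times would be longer on both networks.

```python
# Idealized transfer times at nominal link rates.
# Assumptions: a hypothetical 1TB (10**12 bytes) payload, a single link,
# and no encoding, protocol or file system overhead.
def transfer_seconds(num_bytes, gbits_per_sec):
    """Seconds needed to move num_bytes over a link of the given nominal rate."""
    return num_bytes * 8 / (gbits_per_sec * 10**9)

payload = 10**12  # 1TB

gige_secs = transfer_seconds(payload, 1)  # Gigabit Ethernet
fc_secs = transfer_seconds(payload, 2)    # 2Gb/s Fibre Channel

print(f"Gigabit Ethernet: {gige_secs / 3600:.2f} hours")
print(f"2Gb/s FC:         {fc_secs / 3600:.2f} hours")
```

On these idealized numbers, the same terabyte takes a little over two hours on Gigabit Ethernet and a little over one hour on 2Gb/s FC.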
But for some industry observers, that's all bluff and bluster designed to support a technology that has a questionable role to play in the average enterprise data center. "There's no question that if Almaden [the IBM research lab where Storage Tank was developed] had come up with the idea for Storage Tank today rather than five years ago, it never would have been built," says one industry observer who asked not to be named. But as it stands, "there are people whose entire careers depend on it," he says. "They're going to shine this pig up and bring it out to the party." For this source, who is a veteran of the distributed file system world, the cost and complexity that SAN FS brings to an environment far outweigh its purported benefits.
ESG's Kenniston, meanwhile, has a far rosier view of SAN FS' future--but that future is much farther out than IBM would have you believe. "If you look at the future, it's all about Linux-racked servers, morphed with some sort of [Veritas blade management software] Jareva, running Oracle 9i RAC, on top of a clustered file system. That's the future." But he adds, "That's another two to four years away."
When it comes to blade computing, Joaquin Ruiz, vice president of marketing at Sistina Software Inc., Minneapolis, MN, agrees with Kenniston that clustered file systems are critical to the evolution of blade computing. "In the '90s, the main reason for using clustered file systems was high availability, not performance," he says. But with the advent of blades, "you no longer have just one or two monolithic servers connected in the SAN--you have dozens," and the clustered file system can act as "the control point" that lets you manage it all.
However, Sistina, like its competitor PolyServe Inc. in Beaverton, OR, eschews SAN FS' asymmetric out-of-band metadata architecture, instead choosing to distribute metadata symmetrically on all the nodes in the cluster. That approach, says Michael Callahan, PolyServe founder and CTO, is similar to the one used in "classical VMS clusters," from the now-defunct Digital Equipment Corporation, and is much better suited to the high-performance transactional applications that many companies envision running on their blade servers. If other clustered file systems that rely on an asymmetric out-of-band architecture are any indication, he says, it is questionable whether IBM's SAN FS will be able to meet the performance and high-availability requirements of blade applications such as parallel databases.
iSCSI and Ethernet are two other factors that could figure into when and how clustered file systems get deployed. Says Paul Ross, director of storage network marketing at EMC: "Go out three years--Ethernet's at 10 Gig, and more people are running iSCSI than not. At that point, SAN file systems start to make a lot of sense because you don't have to deploy a separate infrastructure to run them. The SAN file system becomes a LAN file system."
But what about today? Is SAN FS the answer to storage managers' file storage needs? In large environments, quite possibly, says ESG's Kenniston. But he cautions, "In the IT world, there are the haves and the have-nots. This is for the haves."