Clustering is a fairly new phenomenon in the storage world, at least in terms of wide-scale deployments. The growth of data has seen leading-edge large shops -- oil and gas surveyors, huge Web sites like Amazon.com and MySpace, and shops with an unusually high I/O demand, like Pixar's movie studio -- looking to clustering as a way to improve performance.
Clustering is essentially a spinoff from NAS -- a way to split I/O and storage up in different combinations in order to avoid having to buy multiple NAS filers with separate file systems and disk back ends. The name that usually comes up when discussing the limitations of buying box after box to scale out a NAS system is Network Appliance Inc. (NetApp). NetApp, for its part, has been promising a cluster system based on intellectual property it acquired from Spinnaker Networks, but so far hasn't delivered.
Meanwhile, startups are rushing in to provide clustered file systems. The major clustering players are Ibrix Inc., PolyServe Inc., Isilon Systems Inc., Exanet Inc. and Panasas Inc. Most recently, a company called Crosswalk Inc. -- heretofore a storage resource management software player -- has come onto the scene with a clustering product called iGrid, which, of course, it claims is vastly superior to the products of established clustering players. Crosswalk warrants perhaps a little more attention than is normally paid to a fledgling startup since its founder, Jack McDonnell, is the founder and former CEO of McData Corp. Crosswalk is well backed in both industry experience and funding.
So is iGrid really different from the established players? The short answer is yes, there are some differences. Are they as glaring as Crosswalk makes them out to be? No.
In a way, iGrid fits somewhere in between Ibrix's system and Isilon's. Each Isilon system node comes with its own disk attached to a NAS head. Isilon breaks up the file system across the NAS heads and even breaks individual files up across the disk nodes. Ibrix, on the other hand, sits on commodity servers in front of back-end storage that the customer can choose on its own, but if Ibrix's recent installations are any indication, that back-end is usually EMC Corp.'s Clariion arrays, and the server hardware is often from Dell, Inc. iGrid, meanwhile, does not come with disk -- making it more like Ibrix. But like Isilon, it doesn't assign one portion of the file system to one node in the cluster.
At least theoretically, Crosswalk's marketers have a point about Ibrix -- if portions of the file system reside on one server or another, there's the chance that "hot spots" could develop if multiple end users are pounding away at the same file or data block.
But Ibrix has a response to this -- the number of servers that can be added to the cluster layer is almost unlimited. Even if a "hot spot" were to develop, according to Ibrix's director of marketing Joe DeRosa, the user could simply tack on another node, spread the file system out a little further and the problem would be solved.
As far as Isilon is concerned, since files and the file system are broken up into so many tiny pieces across the nodes (and more nodes can be added without the administrator having to manually rebalance the file system), the point is probably moot there as well.
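The hot-spot argument in the preceding paragraphs can be illustrated with a toy model. The sketch below is not any vendor's actual placement algorithm; the function names and the simple modulo placement rules are assumptions made purely for illustration. It compares how read load lands on servers when a whole file lives on one server versus when its blocks are rotated across all servers.

```python
from collections import Counter

def whole_file_placement(file_id: int, num_servers: int) -> int:
    """Toy rule (assumption): the entire file lives on a single server."""
    return file_id % num_servers

def striped_placement(file_id: int, block_index: int, num_servers: int) -> int:
    """Toy rule (assumption): consecutive blocks rotate across servers."""
    return (file_id + block_index) % num_servers

def load_per_server(requests, num_servers: int, striped: bool) -> dict:
    """Count how many block reads land on each server."""
    load = Counter({server: 0 for server in range(num_servers)})
    for file_id, block_index in requests:
        if striped:
            server = striped_placement(file_id, block_index, num_servers)
        else:
            server = whole_file_placement(file_id, num_servers)
        load[server] += 1
    return dict(load)

# 400 reads, all against one popular file (id 7), spread over its first 100 blocks
requests = [(7, i % 100) for i in range(400)]

print(load_per_server(requests, 4, striped=False))  # {0: 0, 1: 0, 2: 0, 3: 400}
print(load_per_server(requests, 4, striped=True))   # {0: 100, 1: 100, 2: 100, 3: 100}
```

In this simplified model, striping evens out reads against a single popular file, but it would not help if every request hit the very same block, since that block still lives on one server.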
A similar theoretical concern applies to Isilon: hammering on the same file could create a "hot spot" not in the file system but on the disk included in each Isilon node. But that's where Isilon's memory cache comes in, according to Isilon's vice president of marketing Brett Goodwin.
"For example, on a 10-node cluster, you'd have 4 GB of front-end memory read cache on each node, and the nodes aggregate that across the whole system. So in a 10-node system, you'd have 40 GB of front-end read cache available. There wouldn't be a hot spot on the disk because the system wouldn't have to go back to the disk every time to get it."
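Goodwin's arithmetic is easy to check. The short sketch below simply multiplies node count by per-node cache, then asks whether a hot working set would fit in the aggregate; the function names and the 25 GB and 60 GB working-set figures are hypothetical and do not come from Isilon.

```python
def aggregate_read_cache_gb(nodes: int, cache_per_node_gb: int) -> int:
    """Total front-end read cache if every node contributes its full cache."""
    return nodes * cache_per_node_gb

def fits_in_cache(working_set_gb: int, nodes: int, cache_per_node_gb: int) -> bool:
    """True if the hot data could, in principle, be served entirely from cache."""
    return working_set_gb <= aggregate_read_cache_gb(nodes, cache_per_node_gb)

print(aggregate_read_cache_gb(10, 4))  # 40, matching the figure in the quote
print(fits_in_cache(25, 10, 4))        # True: a hypothetical 25 GB hot set fits
print(fits_in_cache(60, 10, 4))        # False: a hypothetical 60 GB hot set does not
```

This ignores real-world effects such as cache eviction and duplicated entries across nodes, so it gives an upper bound rather than a prediction.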
And Goodwin fired back a point of his own about iGrid -- while Crosswalk claims that the system's lack of included disk provides more flexibility, Goodwin argued that it could also add complexity.
"Customers like to have a complete system," he said. "Many of the cluster systems that don't include disk from the beginning end up partnering with hardware vendors anyway." (This is a reference to Ibrix's Dell/EMC connections.)
For Crosswalk's part, a hardware partnership is indeed on the horizon. McDonnell hinted this week that the company was in talks with a major vendor to use its hardware for the back-end storage on the cluster system, but he declined to give further details.
Meanwhile, Panasas, which sells a Linux-based parallel file system product, seemed to agree with Crosswalk's claims that the other cluster players have "bottlenecks." Its solution to this problem is for cluster nodes to run a SAN file system rather than NFS, eliminating the "middleman" of splitting a file system among NAS heads. This approach, called a parallel file system, is one iGrid also claims to have -- but Panasas points out, and rightly so, that it was there first.
In the end, it seems the competition will come down, as usual, to business politics and good marketing, rather than any truly glaring technological differentiators between the products. And so far, both sides are sanguine that they have the best chance.
Both Ibrix and Isilon flexed their muscles when talking about Crosswalk -- DeRosa and Goodwin made sure to name-drop their flagship customers, Pixar and MySpace, respectively.
"The road between having a concept and having hundreds of customers with your system in production is a long one," Goodwin snarked. "I've seen a lot of players come and make really bold claims, but I can tell you we haven't seen the iGrid in any customer engagements so far. Not one."
Crosswalk did put SearchStorage.com in touch with Bill Ward, CEO of Front Range Internet, a small ISP in Colorado that has signed on as a beta tester for the iGrid. Ward said he expected the iGrid cluster to let him add managed storage services for large customers, including one that would add 50 to 100 terabytes of storage on his systems by itself.
However, Ward did admit that so far the limit on the number of nodes in an iGrid cluster was eight -- though Crosswalk has promised 256 in the near future and anticipates that the grid will scale as far as the other players'. As Goodwin rightly pointed out, it just hasn't had occasion to yet.
Moreover, while Ward said iGrid had replaced NetApp boxes in his environment, and that he had evaluated Ibrix, Isilon and others, he couldn't say what the technical differentiators between iGrid and the other grid products were. Mostly, he said, he valued "getting in on the ground floor" with iGrid and Crosswalk and "having some input into the development of the product."
Which is all well and good, but the vendors are going to try to tell new cluster users in the coming months that there are strong technological contrasts between competing products. After all, that's their job. But their prospective customers shouldn't be fooled -- they should keep their eyes open, like Ward, and go with the company they feel most comfortable with.