Among the myriad challenges storage managers face, some of the biggest are: how do you grow the system, how do you make sure your data is available and how do you deliver adequate performance--all for a reasonable price?
Inspired by the obvious shortcomings of existing storage systems, as well as the availability of low-cost, high-speed interconnect technology, some storage vendors--startups as well as some more established players--are building storage systems around clustering technology. These include IP SAN pioneers EqualLogic and LeftHand Networks; Isilon, which is developing storage for the rich media and content creation market; 3PAR, whose InServ Storage Server uses clustering between internal nodes; and Xiotech, which announced the clustered 3D version of its Magnitude array.
What can clustering bring to a storage system? "If you look at clustering generically, it's used to solve two very different problems: redundancy or scaling," otherwise known as availability and performance, says Peter Hayden, founder and CEO of EqualLogic, Nashua, NH.
At Xiotech, Eden Prairie, MN, clustering was enlisted largely to bring added resiliency to its existing Magnitude array, which, while much loved by users, suffered from the fact that it had no failover capabilities. Using RAID 10 or 50 to stripe and mirror data across all available spindles in the cluster, up to half of the disk drives and all but one of the node controllers in a Magnitude 3D cluster can fail, and you can still have access to your data, says Rob Peglar, Xiotech chief architect.
"We know that components are going to fail--there's no question about it. The question is: How do you architect the system to provide maximum resilience?" Peglar says.
The Magnitude 3D today can only consist of two nodes, but Xiotech plans to increase that number to 16, as well as provide data replication across geographically dispersed clusters.
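The "up to half of the disk drives" figure is easiest to see in a toy model of a striped-and-mirrored (RAID 10-style) layout. The sketch below is only an illustration under stated assumptions, not a description of how the Magnitude 3D places data: the function name and the drive numbering (drives 2i and 2i+1 form mirror pair i) are hypothetical. It shows that data stays accessible as long as no mirror pair loses both of its drives, which is why half of the drives can fail only in the best case.

```python
def data_accessible(num_pairs, failed_drives):
    """Return True if every mirror pair still has at least one working drive.

    Assumption (hypothetical layout): drives are numbered 0..2*num_pairs-1,
    and drives 2*i and 2*i+1 hold identical copies (mirror pair i).
    Data is striped across the pairs, so losing any whole pair loses data.
    """
    failed = set(failed_drives)
    for i in range(num_pairs):
        pair = {2 * i, 2 * i + 1}
        if pair <= failed:  # both copies of this stripe member are gone
            return False
    return True


# Eight drives in four mirror pairs.
print(data_accessible(4, [0, 2, 4, 6]))  # one drive lost from each pair
print(data_accessible(4, [0, 1]))        # both drives of pair 0 lost
```

In this model, four of eight drives can fail without data loss when the failures are spread one per pair, while just two failures in the same pair are enough to lose access.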
At 3PAR, Menlo Park, CA, clustering happens between up to eight nodes under the hood of a single chassis, with access to a single write coherent cache. That provides what Geoff Hough, 3PAR director of product marketing, calls "a single block storage system," which he defines as "a platform whose volumes [LUNs] are accessible through any or all of its host ports at any given time." Ultimately, the clustered InServ architecture provides "the cache coherency of a monolithic array," failover between nodes, plus the "cost-effective components" associated with modular storage.
But according to Boulder, CO-based LeftHand Networks' CTO, John Spiers, "another benefit of clustering is management since it aggregates the systems." Certainly, management was the feature that sold the restaurant chain Noodles & Company, Boulder, CO, on LeftHand's IP SAN, which consists of clustered network storage modules, or NSMs. With an IT staff of four, Noodles & Company quickly ruled out Fibre Channel SAN solutions as "overkill," and set out to find "a highly available, fault-tolerant solution that would allow us to consolidate our storage, and still be easy to manage," says Nick Fields, Noodles & Company systems engineer. The company installed a cluster of four NSMs, with connections into the company's Exchange and SQL databases. The whole cluster takes "less than a half an hour per week to manage," estimates Fields, who hopes to replicate the cluster to a co-location facility next year.
Noodles & Company is all set for capacity right now, but should it need more, it will have plenty of room for expansion: a LeftHand Networks IP SAN cluster can contain a theoretical maximum of 2^64-1 NSMs. More practically speaking, "we are limited by the network bandwidth," says Tom Major, LeftHand vice president of marketing, adding that the company has customer deployments with 16 nodes.
So is clustering the new be-all and end-all of storage architecture? In the server world at least, clustering has never really taken off beyond high-availability applications.
Why is that? One reason, says EqualLogic's Hayden, is that designers sometimes ask too much of clustering. Trying to solve both the scalability and availability problem with clustering "is where people have gotten into trouble," he says.
Take clustering for performance, for example. "Given a four-node server cluster, you might only get two times the performance," Hayden says, "because of the overhead incurred by the nodes communicating with one another. Performance is not additive." EqualLogic avoids this pitfall, he explains, thanks to a loosely coupled clustering architecture derived from networking--not computing--principles. It advertises that its PeerStorage Array 100E can be linked via iSCSI into a peer group of up to 25 nodes, for approximately 100TB of capacity. Furthermore, "a data request in a PeerStorage cluster never involves more than two members," he says, "so the overhead is always the same."
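Hayden's point that "performance is not additive" can be made concrete with a rough back-of-the-envelope model. The sketch below is an assumption-laden illustration, not EqualLogic's design or measured data: the function names and the overhead parameters are hypothetical. It contrasts a cluster in which each node pays a coordination cost for every other node with one in which each request pays a fixed cost regardless of cluster size.

```python
def tightly_coupled_throughput(nodes, overhead_per_peer):
    """Throughput relative to one overhead-free node.

    Assumption: each node spends `overhead_per_peer` units of coordination
    work per peer for every unit of useful work, so the cost grows with
    the size of the cluster.
    """
    useful_fraction = 1.0 / (1.0 + overhead_per_peer * (nodes - 1))
    return nodes * useful_fraction


def loosely_coupled_throughput(nodes, overhead_per_request):
    """Throughput relative to one overhead-free node.

    Assumption: every request involves a fixed number of members, so the
    coordination cost per unit of useful work is constant.
    """
    return nodes / (1.0 + overhead_per_request)


# Hypothetical overhead of one third of a unit, chosen so that the
# four-node tightly coupled case matches the "two times" figure quoted above.
for n in (2, 4, 8, 16):
    tight = tightly_coupled_throughput(n, 1 / 3)
    loose = loosely_coupled_throughput(n, 1 / 3)
    print(f"{n:2d} nodes: tightly coupled {tight:.2f}x, loosely coupled {loose:.2f}x")
```

Under these assumptions the tightly coupled curve flattens as nodes are added, while the fixed-overhead curve keeps growing linearly with a constant discount. Real systems will differ, since network bandwidth and workload mix also limit scaling.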