Beyond the file system
A fully clustered storage system goes beyond what the servers and applications see; it provides the underpinnings and infrastructure of the storage system itself. Among available products, the best examples are those that have been built from the ground up to deliver clustered storage. These hardware-based systems address the scalability of physical resources, not just that of the file system. According to Kerns, these systems have an advantage over some of the software-only approaches to clustering. "You're going to put on another layer of software and yet you're still probably going to manage those devices independently," Kerns said.
While most midrange storage systems offer a modular approach to growing capacity, clustered systems take the concept a step further. Typically, in a non-clustered midrange array, a module (or expansion unit) is added to increase disk capacity; in some cases, another controller can be added to increase the horsepower of the array. For the most part, these modular midrange arrays can scale capacity, but not performance. "If you're just adding disk, but aren't doing anything about performance," said Kerns, "obviously you'll see some degradation."
In a clustered storage architecture, modules are typically packages that include not only additional disks, but a controller assembly with its own set of interfaces. Building out a clustered array therefore increases performance and connectivity along with capacity. Because a full complement of processors, memory, ports and so forth is added with each set of new disks, the performance of a clustered storage system will often scale linearly as it expands. This is in stark contrast to non-clustered modular systems where performance is likely to suffer as disk expansion units are added.
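The contrast between the two growth models can be illustrated with a toy calculation. The sketch below is not drawn from any vendor's specifications: the function names and the throughput figures (400 MB/s per controller, 200 MB/s per disk shelf) are assumptions chosen only to show the shape of the curves, and the model ignores interconnect overhead.

```python
# Toy model: aggregate throughput as capacity grows.
# All figures are hypothetical and chosen only for illustration.

def modular_throughput(shelves, controller_mbps=400, shelf_mbps=200):
    """Non-clustered array: one controller serves every disk shelf,
    so aggregate throughput is capped by that single controller."""
    return min(controller_mbps, shelves * shelf_mbps)


def clustered_throughput(modules, controller_mbps=400, shelf_mbps=200):
    """Clustered array: every module brings its own controller,
    so aggregate throughput grows with the module count."""
    return modules * min(controller_mbps, shelf_mbps)


for units in (1, 2, 4, 8):
    print(units, modular_throughput(units), clustered_throughput(units))
```

Under these assumptions the modular array stops gaining throughput once its single controller is saturated, so each added shelf gets a smaller share of the same ceiling, while the clustered array's aggregate figure keeps rising with each module.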
When a module is added to the cluster, the other members of the cluster automatically recognize the new module. The cluster then reorganizes itself to accommodate the added capacity by re-striping data across all disks, sharing data management policies and balancing the workload among all members. Usually, cluster modules interconnect with each other using a Fibre Channel (FC) or Gigabit Ethernet (GbE) interface, although Isilon recently announced it will offer clustered storage systems that use InfiniBand connections, which are approximately 10 times faster than GbE.
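Re-striping can be sketched in a few lines. The example below assumes the simplest possible layout, in which block number b lives on disk b modulo the disk count; this is an illustrative assumption, not the placement scheme of Isilon or any other vendor, and the names place and restripe are hypothetical. Production systems generally use layouts designed to move far less data when the disk count changes.

```python
from collections import Counter

# Simplified round-robin striping. Real clustered systems use more
# sophisticated layouts; this only shows what "re-striping" means.

def place(block_id, disk_count):
    """Return the disk that holds a block under round-robin striping."""
    return block_id % disk_count


def restripe(block_count, old_disks, new_disks):
    """Count the blocks that must move when the disk count changes,
    and report how many blocks each disk holds afterward."""
    moved = sum(
        1
        for block in range(block_count)
        if place(block, old_disks) != place(block, new_disks)
    )
    load = Counter(place(block, new_disks) for block in range(block_count))
    return moved, load


# A module adds two disks to a four-disk stripe holding 1,200 blocks.
moved, load = restripe(block_count=1200, old_disks=4, new_disks=6)
print("blocks moved:", moved)
print("blocks per disk:", dict(sorted(load.items())))
```

After the re-stripe every disk, old and new, holds an equal share of the blocks, which is what lets the added spindles contribute to performance rather than just capacity. The cost is the data movement itself, which in this naive layout touches most of the blocks.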
Servers connected to the clustered array are unaffected. Typically, there's no need for client software on the host servers, and they can continue to access storage from the pool even as new capacity is added. Within the storage cluster, the specific controller that a host connects to is almost irrelevant, as cluster modules can hand off responsibility for those interfaces to one another to adjust to failures or varying loads and bandwidth requirements.
For cluster modules to interact effectively, their operating systems must be in constant communication. If a unit fails -- or shows signs of an impending failure -- its processing workload is picked up by other cluster modules and data is transferred from its disks to others, if necessary. This arrangement provides effective failover to ensure availability and, as more modules are added, data protection and availability increase as well.
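A minimal sketch of that failover behavior follows. It assumes a heartbeat scheme in which each module periodically reports in and is declared failed after a timeout, and it hands the failed module's workloads to whichever survivor is least loaded. The function names, the three-second timeout and the module and volume names are all hypothetical; the article does not describe any vendor's actual mechanism.

```python
# Hypothetical heartbeat-based failover sketch.

def detect_failed(last_heartbeat, now, timeout=3.0):
    """Return the modules whose last heartbeat is older than the timeout."""
    return sorted(
        module
        for module, seen in last_heartbeat.items()
        if now - seen > timeout
    )


def fail_over(assignments, failed):
    """Reassign a failed module's workloads to the least-loaded survivors.

    assignments maps a module name to the list of workloads it serves.
    Ties are broken by module name so the result is deterministic.
    """
    survivors = {m: list(w) for m, w in assignments.items() if m != failed}
    if not survivors:
        raise RuntimeError("no surviving modules to take over the workload")
    for workload in assignments.get(failed, []):
        target = min(survivors, key=lambda m: (len(survivors[m]), m))
        survivors[target].append(workload)
    return survivors


cluster = {
    "node-a": ["vol1", "vol2"],
    "node-b": ["vol3"],
    "node-c": ["vol4", "vol5"],
}
heartbeats = {"node-a": 99.5, "node-b": 99.0, "node-c": 92.0}

for module in detect_failed(heartbeats, now=100.0):
    cluster = fail_over(cluster, module)
print(cluster)
```

The sketch covers only the reassignment of processing workload. Moving data off a failing module's disks, which the cluster may also need to do, is a separate step not modeled here.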
Most importantly, as modules are added to accommodate new requirements, administration remains constant. Even as the cluster grows, "I can administer it as a single system and don't have to change anything," Kerns said. "I don't have to administer another box."
Sports Illustrated in New York City opted for clustered storage to support its onsite digital photography operations. Phil Jache, deputy director of technology for the magazine, said their three Isilon IQ arrays have been air-shipped to the Olympics, Super Bowl and other major events. The Isilon systems cut two or three hours from the magazine's photo processing time. "It enabled us to do some things that just weren't possible [before]," Jache said.
AccessIT installed a Xiotech Magnitude 3D clustered storage system at its Managed Services Division in New York City and another at its Media Services headquarters in Los Angeles. Erik Levitt, president and COO of the Managed Services Division, said AccessIT installed one of the Xiotech boxes to support its IT services business, which serves clients in 35 countries from 10 data centers. The Los Angeles-based Xiotech system is used primarily for the distribution of digital films, such as "I, Robot" and "Shark Tale," to nearly 30 movie theaters equipped with digital projection systems.
The Xiotech cluster in New York replaced a traditional monolithic SAN. Levitt said the price of the Magnitudes was a major selling point. "We're adding about 5TB at a clip, so scalability is extremely important to us," Levitt said.