This article can also be found in the Premium Editorial Download "Storage magazine: How to scale up with storage clusters."
Traditional vs. clustered NAS storage
Storage systems typically become siloed as more capacity is required. Multiple connections are required to allow hosts to access all installed storage. In a clustered storage system, the storage controllers communicate with each other internally and present a single file system to the hosts. Multiplatform hosts can connect to the cluster through a single connection to a switch which, in turn, is attached to the cluster.
Clustering has improved the reliability, availability and manageability of data center servers while allowing bundles of inexpensive configurations like blades to replace costly, monolithic servers. The benefits of server clustering haven't escaped the notice of the storage industry, but clustering storage involves challenges beyond just tying servers together. Vendors have taken diverse paths to address those challenges, but their approaches fall into two main categories: clustered file systems and standalone hardware with a clustered architecture.
"With traditional midrange storage systems, you can quickly run out of hardware resources," says Tony Asaro, senior analyst at the Enterprise Strategy Group (ESG), Milford, MA. When more capacity or horsepower is needed, traditional systems offer few alternatives other than installing another storage device with all of its associated costs.
Implementing a clustered storage system doesn't require clustered servers. While the technologies are quite similar, they aren't interdependent.
The growing popularity of clustered storage has also spawned the usual industry buzzword mania. Storage vendors of all stripes are touting their hardware and software products as clustering technologies--products that may be implemented at nearly any point in a storage environment. Although the sales pitches tend toward hyperbole, most of these products do qualify as clustering applications, though many are point solutions rather than total ones.
Vendors have turned toward clustering technologies to address the four big issues facing most storage managers. These design goals aren't the exclusive province of clustering--nearly all storage systems strive for these--but they're the fundamental goals of clustered systems:
- Capacity scaling. Additional storage capacity should be easy to add in a non-disruptive manner.
- Performance scaling. As capacity is added and the number of supported hosts grows, performance should scale sufficiently to maintain an acceptable service level.
- Availability. Redundant components and transparent failover should ensure data is always available.
- Manageability. Scaling, failover and capacity management should be as automated as possible.
These goals may be achieved in a variety of ways, but there are some basic precepts of clustered storage. For example, clustered systems pool their storage and present it to hosts as a single image: a global file system that's often referred to as "a single drive letter." This makes better use of available capacity while easing storage management. It also enhances the ability of hosts to share data while avoiding multiple instances of the same files (see "Traditional vs. clustered NAS storage").
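To make the single-image idea concrete, here is a minimal Python sketch of pooled storage behind one namespace. It is an illustration only, not any vendor's design: the `Node` and `GlobalNamespace` classes, the "most free space" placement rule and the capacities are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One storage node in the cluster (hypothetical model)."""
    name: str
    capacity_gb: int
    files: dict = field(default_factory=dict)  # path -> size in GB

    @property
    def free_gb(self):
        return self.capacity_gb - sum(self.files.values())


class GlobalNamespace:
    """Presents several nodes to hosts as one file system (illustrative only)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.index = {}  # path -> Node that holds the file

    @property
    def free_gb(self):
        # Hosts see one pooled figure, not per-node silos.
        return sum(n.free_gb for n in self.nodes)

    def write(self, path, size_gb):
        if path in self.index:
            raise FileExistsError(path)
        # Assumed placement rule: pick the node with the most free space.
        target = max(self.nodes, key=lambda n: n.free_gb)
        if target.free_gb < size_gb:
            raise OSError("no single node has room for " + path)
        target.files[path] = size_gb
        self.index[path] = target
        return target.name

    def locate(self, path):
        # Hosts use only the path; the cluster resolves the node.
        return self.index[path].name

    def add_node(self, node):
        # Capacity grows without hosts changing how they connect.
        self.nodes.append(node)


cluster = GlobalNamespace([Node("a", 100), Node("b", 100)])
cluster.write("/projects/report.doc", 60)
cluster.add_node(Node("c", 200))
print(cluster.free_gb, "GB free across", len(cluster.nodes), "nodes")
```

In this sketch a host never names a node; it writes to a path and the cluster decides placement. A real clustered file system would also handle striping, metadata consistency and rebalancing, none of which is modeled here.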
The simplest form of clustering involves two controller units paired so that one provides failover for the other. In a two-way, active-passive paired configuration, one controller is essentially on standby. Because this scheme doesn't provide for scaling and the passive unit doesn't share the primary's load, it's often referred to as "pseudo clustering." An active-active controller arrangement, in which the two controllers provide failover for each other and share the work, is a step up from pseudo clustering.
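The difference between the two pairings can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `ControllerPair` class, the controller names "A" and "B" and the round-robin routing rule are hypothetical and chosen only to show which controllers carry I/O before and after a failure.

```python
class ControllerPair:
    """Two paired controllers in either pairing mode (illustrative model)."""

    MODES = ("active-passive", "active-active")

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError("unknown mode: " + mode)
        self.mode = mode
        self.healthy = {"A": True, "B": True}

    def fail(self, name):
        self.healthy[name] = False

    def serving(self):
        """Return the controllers currently handling I/O."""
        up = [c for c in ("A", "B") if self.healthy[c]]
        if self.mode == "active-passive":
            # Only one controller works; the other waits on standby.
            return up[:1]
        # Active-active: every healthy controller shares the load.
        return up

    def route(self, request_id):
        serving = self.serving()
        if not serving:
            raise RuntimeError("both controllers are down")
        # Assumed load-sharing rule: simple round-robin by request number.
        return serving[request_id % len(serving)]


standby_pair = ControllerPair("active-passive")
shared_pair = ControllerPair("active-active")
print(standby_pair.serving())  # ['A']
print(shared_pair.serving())   # ['A', 'B']
```

In both modes a failed controller's work moves to its partner; only in the active-active case does the partner carry part of the load beforehand, which is why the active-passive pair adds availability but no performance.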
In a non-distributed, active-active cluster, cluster members share a file system and some other physical resources, but provisioning and LUN assignment for specific controllers are mostly manual chores. The distributed peer cluster is the most common architecture employed by vendors that have designed and built their clustered storage systems from the ground up. In a distributed cluster, physical resources are virtualized, so a storage administrator only needs to deal with how storage is associated with installed servers and the applications they host.
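As a rough sketch of what that virtualization means for the administrator, the Python function below takes only a host and a size and leaves placement to the cluster. The `provision` function, the 10 GB extent size and the round-robin layout across peers are assumptions made for illustration; actual products use their own placement schemes.

```python
def provision(peers, host, size_gb, extent_gb=10):
    """Spread a volume's extents round-robin across all peers.

    The administrator supplies only the host and the size; no
    controller or LUN is chosen by hand (hypothetical policy).
    """
    if not peers:
        raise ValueError("cluster has no peers")
    n_extents = -(-size_gb // extent_gb)  # ceiling division
    layout = {p: 0 for p in peers}
    for i in range(n_extents):
        layout[peers[i % len(peers)]] += 1
    return {"host": host, "size_gb": size_gb, "extents_per_peer": layout}


volume = provision(["n1", "n2", "n3"], host="db01", size_gb=100)
print(volume["extents_per_peer"])  # {'n1': 4, 'n2': 3, 'n3': 3}
```

By contrast, in the non-distributed case the administrator would have to decide which controller owns each LUN, which is the manual chore the distributed design removes.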
This was first published in April 2005