Using clustering technology to scale the performance, capacity, connectivity and availability of servers isn't new. Clustered storage, however, is another matter.
While clustered storage is often associated with high performance computing, the reality is that mainstream commercial environments are adopting clustered storage at a rapid rate. These businesses are attracted by the way clustered storage now leverages established technologies such as Ethernet, Fibre Channel and InfiniBand protocols, by its reliance on open access methods such as NFS and Windows CIFS, and by its use of industry-standard servers and third-party storage.
The clustered storage solutions enjoying the highest growth rates may be network-attached storage (NAS) file servers. Deployments of this technology are being driven by organizations' need to scale beyond the limits of a single storage box to handle structured and unstructured data. That scaling takes several forms:
- scaling in performance, whether large sequential bandwidth (throughput), small random IOPS (transactional) or metadata lookup;
- scaling in storage capacity;
- scaling availability on a local or distributed basis to protect against device or site failure;
- scaling of flexibility, including concurrent access to the same or different data along with parallel access to data for different application needs;
- scaling in terms of offering modular (pay-as-you-grow) storage growth; and
- scaling in ease of management for tasks such as storage provisioning, load balancing and data protection.
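To make the capacity, performance and pay-as-you-grow items concrete, here is a minimal Python sketch of how a cluster's totals grow as nodes are added. The `Node` class, the `cluster_totals` function and the `efficiency` factor are all hypothetical names invented for illustration, and the model assumes throughput adds up linearly apart from that one overhead factor, which real products will not match exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One hypothetical storage node; the figures are illustrative, not vendor data."""
    capacity_tb: float
    throughput_mbps: float


def cluster_totals(nodes, efficiency=1.0):
    """Return (total capacity in TB, usable throughput in MB/s) for a list of nodes.

    efficiency is an assumed 0-to-1 factor standing in for throughput
    lost to cluster coordination overhead.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    capacity = sum(n.capacity_tb for n in nodes)
    throughput = sum(n.throughput_mbps for n in nodes) * efficiency
    return capacity, throughput


# Grow a cluster one node at a time and watch the totals rise.
cluster = []
for _ in range(4):
    cluster.append(Node(capacity_tb=10.0, throughput_mbps=200.0))
    print(len(cluster), cluster_totals(cluster, efficiency=0.75))
```

The point of the sketch is only that each added node contributes both capacity and bandwidth, which is what separates scale-out clustering from adding shelves of disk behind a fixed controller.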
Approaches to NAS and file serving clustering
The technologies that most companies are clustering are storage, file systems and file servers. Clustering adds standby or failover capabilities to storage systems, which in turn can scale across a large number of controllers, storage nodes or processors, often together with clustered file systems. One reason for the confusion in discussions of clustered storage is that there are block-based (iSCSI and Fibre Channel) and file-based (NAS using NFS and CIFS) storage, virtual tape libraries and other types of clustered storage solutions.
Clustered file systems enable administrators to access a common pool of storage across application servers. They also permit shared read and write access to data files, which is useful for maintaining data consistency and integrity whether using direct-attached or networked storage. Examples of clustered file systems are SGI CXFS, Quantum StorNext, Red Hat GFS, and IBM's SFS and GPFS. Not all clustered NAS boxes have a clustered file system and not all clustered file systems rely on clustered NAS servers. Some systems (for example, IBRIX Fusion) combine both.
What differentiates a clustered file server from a traditional NAS file server or clustered storage system is the way hardware and software are combined. A clustered file system can be installed on application servers or on dedicated appliances or servers, transforming them into storage servers (essentially, clustered file servers). Some clustered file servers, such as HP PolyServe or IBRIX Fusion, are hybrids that enable a clustered and/or parallel file system to be deployed on industry-standard servers.
Some vendors whose systems have dual or redundant storage controllers, storage engines, NAS heads or gateways running in active/active (both controllers working) or active/passive (one controller in standby) mode claim to offer clustered storage systems. All I can say is that if you consider a pair of storage processors or controllers a cluster, you'd have to consider every storage system with at least two nodes a cluster, which would encompass pretty much all of the mid-range SAN, DAS and NAS storage systems in the marketplace.
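The difference between the two controller modes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's design: the `Controller` and `ControllerPair` classes are hypothetical, and active/active is modeled as simple round-robin across whichever controllers are healthy.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    name: str
    healthy: bool = True
    served: int = 0  # requests handled so far


class ControllerPair:
    """Hypothetical two-controller storage system (illustrative only)."""

    def __init__(self, mode):
        if mode not in ("active/active", "active/passive"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.controllers = [Controller("A"), Controller("B")]
        self._next = 0  # round-robin cursor, used only in active/active mode

    def route(self):
        """Pick the controller for one I/O request and return its name."""
        healthy = [c for c in self.controllers if c.healthy]
        if not healthy:
            raise RuntimeError("both controllers have failed")
        if self.mode == "active/passive":
            # The first healthy controller does all the work;
            # B serves I/O only after A has failed.
            chosen = healthy[0]
        else:
            # Alternate requests across the healthy controllers.
            chosen = healthy[self._next % len(healthy)]
            self._next += 1
        chosen.served += 1
        return chosen.name


pair = ControllerPair("active/active")
print([pair.route() for _ in range(4)])
pair.controllers[0].healthy = False  # simulate controller A failing
print(pair.route())
```

In either mode the system tops out at two controllers. The sketch shows failover and load sharing, but nothing in it scales past the pair, which is the distinction the paragraph above draws.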
There are many more vendors providing clustered NAS storage (in other words, going beyond basic failover) and, more importantly, clustered file servers. NAS, by its nature, is a file serving solution that sits on top of hardware and in some cases has the ability to transform the hardware into a clustered file server. Examples of NAS hardware/software solutions that also support clustering of file systems and the underlying hardware include NetApp GX, BlueArc Titan, and offerings from Isilon and Panasas.
Isilon and Panasas use proprietary processors and storage. BlueArc uses optimized processors that attach to and share access to underlying RAID-equipped storage from several vendors. Other examples, such as HP PolyServe and IBRIX Fusion, rely on clustered file system software installed on industry-standard servers, transforming them into storage servers.
About the author: Greg Schulz is founder and senior analyst with the IT infrastructure analyst and consulting firm StorageIO Group. He is also the author of the definitive book on storage networking, Resilient Storage Networks, published by Elsevier, and is a regular contributor to Storage magazine and other TechTarget venues.
This was first published in March 2008