Clustered NAS: How to choose the right clustered NAS system

Finding the right clustered NAS (or scale-out NAS) system means knowing your storage needs. Learn the questions to ask when evaluating your requirements and the latest offerings.

What you'll learn: Data storage managers must have a clear picture of their unstructured data and file-system environment before investing in a clustered NAS system. We'll give you a list of questions to ask when evaluating your storage requirements and the latest clustered NAS offerings.

Clustered network-attached storage (clustered NAS) uses a distributed file system that runs concurrently on multiple nodes or servers. Unlike traditional NAS, clustered NAS stripes data and metadata across storage nodes and subsystems. Clustering also provides access to all files from any of the clustered nodes regardless of the physical location of the file. But how do you determine which clustered NAS system is right for you? Here are the questions you need to ask when evaluating your own data storage requirements along with the latest offerings from vendors.
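To make the striping idea concrete, here is a minimal sketch of round-robin striping, one common way a distributed file system can spread a file across nodes. It is an illustration only, not any vendor's actual layout: the function names, the stripe size and the node count are all assumptions made for the example.

```python
from typing import Dict, Tuple


def locate_stripe(offset: int, stripe_size: int, node_count: int) -> Tuple[int, int]:
    """Map a byte offset in a file to (node index, offset within that node's share).

    Assumes simple round-robin placement: stripe unit 0 goes to node 0,
    stripe unit 1 to node 1, and so on, wrapping around after the last node.
    """
    stripe_index = offset // stripe_size
    node = stripe_index % node_count
    # Every full pass over the nodes places one stripe unit on each node.
    local_offset = (stripe_index // node_count) * stripe_size + offset % stripe_size
    return node, local_offset


def stripe_layout(file_size: int, stripe_size: int, node_count: int) -> Dict[int, int]:
    """Count how many bytes of a file land on each node under round-robin striping."""
    bytes_per_node = {node: 0 for node in range(node_count)}
    offset = 0
    while offset < file_size:
        node, _ = locate_stripe(offset, stripe_size, node_count)
        chunk = min(stripe_size, file_size - offset)
        bytes_per_node[node] += chunk
        offset += chunk
    return bytes_per_node


if __name__ == "__main__":
    # Hypothetical example: a 10 MB file, 1 MB stripe units, three storage nodes.
    mb = 1024 * 1024
    print(stripe_layout(10 * mb, 1 * mb, 3))
```

Because consecutive stripe units sit on different nodes, a large sequential read or write can be serviced by several nodes at once, which is the behavior a single-node NAS head cannot offer.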

What is the maximum number of file systems and file objects the system can handle?

NAS has an issue with how many file objects it can manage under one system, according to Marc Staimer, president at Dragon Slayer Consulting. When that limit of file objects is reached, the system tends to shut down without warning, he explained. "It's a real pain to take data off of the system to make it functional enough to start migrating data to another system. So it's not a capacity issue, it's a file object issue more often than not."

Does the vendor's file system span multiple nodes?

If the answer is yes, can every node write to that file system concurrently? If not, you may run into I/O shipping problems, where write requests have to be "shipped" to the data's file-system master node. That could kill expected performance gains.

What's the I/O profile of your applications?

Terri McClure, a senior analyst at Enterprise Strategy Group, said this is important because some scalable NAS systems are optimized for throughput rather than I/O, and don't handle large volumes of small I/O requests well.
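The distinction between throughput and I/O rate comes down to simple arithmetic: throughput equals I/O operations per second multiplied by the size of each request. The sketch below shows that relationship and a rough way to see how much of a workload is small I/O. The function names and the 64 KB cutoff for "small" are assumptions chosen for illustration; the right cutoff depends on your applications and the system under evaluation.

```python
from typing import Sequence


def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s, computed as IOPS x request size (KB) / 1024."""
    return iops * io_size_kb / 1024


def small_io_fraction(request_sizes_kb: Sequence[float], threshold_kb: float = 64) -> float:
    """Fraction of requests at or below threshold_kb (a hypothetical cutoff)."""
    if not request_sizes_kb:
        return 0.0
    small = sum(1 for size in request_sizes_kb if size <= threshold_kb)
    return small / len(request_sizes_kb)


if __name__ == "__main__":
    # Many small requests: a high I/O rate but modest throughput.
    print(throughput_mb_per_s(iops=20000, io_size_kb=4))
    # Few large requests: a low I/O rate but high throughput.
    print(throughput_mb_per_s(iops=500, io_size_kb=1024))
    # Share of small requests in a made-up sample of request sizes (KB).
    print(small_io_fraction([4, 4, 8, 1024]))
```

A workload dominated by small requests can post a very high I/O rate while moving relatively little data, so a system tuned for streaming throughput may look strong on paper and still struggle with it.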

What types of data will sit on your clustered NAS system?

Systems such as BlueArc Corp.'s BlueArc Titan and Mercury, Hewlett-Packard (HP) Co.'s HP X9000 Network Storage System (NSS) and Isilon Systems Inc.'s X-Series are good fits if you plan to store large sequential files such as rich media and video. However, these systems may not be as good a fit for mainstream enterprise file serving.

Do you just need a NAS accelerator?

If you only need capacity for unstructured data, you might not need a clustered NAS system at all, according to Greg Schulz, founder and senior analyst at storage industry consulting firm StorageIO Group. Instead, there are single-node NAS systems that have plenty of capacity. The same consideration applies if you just need better performance: a NAS accelerator might be all you need.

Arun Taneja, founder and consulting analyst at Taneja Group, said scalable NAS systems are built for high utilization rates; once you have your system, you should be using a significant amount of its capacity. "If you're operating at anything less than 80% storage utilization, you're not using that clustered NAS box correctly," Taneja said. "Or the clustered NAS box is not very good and you should throw it out."
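The 80% figure Taneja cites is easy to check against your own numbers. The following is a minimal sketch, assuming you already know the used and total capacity of the system; the function names and the sample capacities are hypothetical.

```python
def utilization(used_tb: float, total_tb: float) -> float:
    """Return storage utilization as a fraction between 0 and 1."""
    if total_tb <= 0:
        raise ValueError("total capacity must be positive")
    return used_tb / total_tb


def meets_target(used_tb: float, total_tb: float, target: float = 0.80) -> bool:
    """True when utilization is at or above the target (80% by default)."""
    return utilization(used_tb, total_tb) >= target


if __name__ == "__main__":
    # Hypothetical system: 300 TB used out of 400 TB.
    print(f"{utilization(300, 400):.0%}", meets_target(300, 400))
```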

Clustered NAS or scale-out NAS architectures vary among data storage vendors

Nearly all major storage vendors have some kind of clustered NAS system by now, but many of them are comparable in name only. That's because these systems -- also known as scale-out NAS -- were built with widely different architectures in an attempt to solve unstructured data performance, capacity and availability issues.

"Right now, the industry seems to think clustered NAS is just clustered NAS," Taneja said. "But just look at the architectural differences; they're not even apples and oranges. It's like apples and beer." Consider three of the industry's biggest names: EMC Corp., Hewlett-Packard (HP) Co. and NetApp Inc.

EMC is an early NAS vendor that arrived relatively late to the clustered NAS scene. Its current clustered NAS system is the Celerra NS-960 with multi-path file system (MPFS). The Celerra NS-960 is part of EMC's unified storage platform, and scales to eight blades and 960 drives. MPFS separates the file and data paths so requests can be fulfilled using either NFS or iSCSI, depending on the data requested.

HP acquired two scalable NAS startups in recent years. Its HP X9000 NSS is based on technology from its 2009 acquisition of Ibrix Inc. The Ibrix technology is best suited for high-performance computing (HPC) in fields such as media, entertainment and life sciences, where customers need access to a large number of files from a single repository. HP also acquired PolyServe in February 2007 and turned that clustered file system software technology into its 9100 Extreme Data Storage (ExDS9100) system, which is best suited for high I/O environments such as transactional databases.

NetApp uses its Data Ontap 8 for scale-out NAS. Ontap 8 combines the vendor's Data Ontap 7G and GX platforms. Taneja predicted Ontap 8 won't see wide acceptance until NetApp upgrades the product to accommodate more than two nodes and integrates its SnapManager software. Ontap 8.1 is scheduled for release during the second quarter of 2011. NetApp acquired the technology for its clustered NAS by buying Spinnaker in 2003.

Dell Inc. and IBM are two other large vendors with new or anticipated scalable NAS systems. Dell scooped up Exanet Inc.'s assets earlier this year, and is using that technology to develop its own clustered NAS product. IBM introduced its Scale Out Network Attached Storage (IBM SONAS) in February. SONAS uses IBM's General Parallel File System (GPFS), includes up to 30 System x3650 nodes and allows up to 256 snapshots per node.

From a technology standpoint, these major vendors are trying to catch up with smaller companies that started as clustered NAS vendors and have been selling scalable systems for years. They include BlueArc, whose Titan 3000 series can scale up to 4 PB within a single namespace and supports NFS and CIFS (Hitachi Data Systems sells BlueArc Titan and Mercury platforms through an OEM deal); and Isilon Systems Inc.

Other scalable NAS offerings include Panasas Inc.'s ActiveStor modular clustered system used primarily in the HPC industry, and Symantec Corp.'s Veritas Storage Foundation Scalable File Server, an appliance based on Veritas Cluster File System.
