Local storage worked fine for the Sacramento, Calif. Superior Court until it decided to go paperless a few years ago. Bracing for the flood of documents and files that would need to be scanned, stored and protected, the court's IT organization purchased network-attached storage (NAS) to help cope with a load that would escalate from 100 GB to 1.7 TB, with much more on the horizon.
That's a classic scenario for a NAS array. Many users need to share the same documents or files. The IT department wants to control access, speed file serving and improve its ability to back up, protect and recover the data. So IT pulls storage out of the servers and consolidates it into a dedicated file storage system that can simplify management.
A NAS array combines a file system and a disk storage system, and makes them available over a network. A controller acts as a traffic cop, running the file system, determining which data goes where and managing the disks and the cache. The file system enables many of the important features, such as snapshots for recovering files and a security model that allows different attributes to be applied to different files on the same volume.
NetApp, which pioneered the NAS product category about 15 years ago, calls its proprietary file system Write Anywhere File Layout (WAFL). "NetApp will tell you that its magic sauce is in WAFL," says Arun Taneja, founder and president of the Taneja Group, a consulting group focused on storage technologies.
How NAS arrays differ from SAN arrays
One of the distinguishing characteristics of a NAS array is its use of file-based protocols: NFS for Unix- and Linux-based systems, and CIFS in Windows environments (although Windows supports an NFS interface, too). NAS arrays also support HTTP for Internet-based information and FTP for large files.
That's a major point of differentiation from a storage area network (SAN). SANs have no file system and support only block-level access to data through protocols such as Fibre Channel or iSCSI.
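The difference between file-level and block-level access can be sketched in a few lines of Python. This is only an illustration, not how any vendor's array works: an ordinary temporary file stands in for both a shared file and a raw volume, and the 512-byte block size and the function names are assumptions made for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size in bytes, chosen only for illustration


def read_file_level(path):
    """File-level access: the client names a file and lets the file system locate the data."""
    with open(path, "rb") as handle:
        return handle.read()


def read_block_level(device_path, block_number):
    """Block-level access: the client asks for a numbered block at a fixed offset."""
    with open(device_path, "rb") as device:
        device.seek(block_number * BLOCK_SIZE)
        return device.read(BLOCK_SIZE)


def demo():
    # A temporary file plays the part of both the shared file and the raw volume.
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "volume.img")
        with open(path, "wb") as handle:
            handle.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)
        whole_file = read_file_level(path)
        second_block = read_block_level(path, 1)
        return whole_file, second_block
```

In the file-level call the caller knows only a name; in the block-level call the caller must know where the data lives, which is the job a file system on the server would normally do for a SAN volume.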
"The original use of NAS systems was for traditional file serving," says Brendon Howe, vice president and general manager of the NAS business unit at NetApp. That meant NAS was often tagged as "lower performance, not fully secure, not mission-critical" compared to a Fibre Channel SAN.
But things have changed a great deal. "There's a much broader use of the NAS protocols, in particular NFS, for running production databases like Oracle or applications like SAP, which are a big part of NetApp's business today. We also have some of the largest companies in the world running VMware [virtualization] installations over an NFS-based infrastructure," Howe says.
While NetApp takes on more mission-critical work, the lines grow increasingly blurry between NAS and SAN. Most NAS arrays now support iSCSI for block-based access to data, a logical next step since NAS protocols and iSCSI run on Ethernet.
"A lot of times, you don't even have to pay extra for it," Taneja says. "That option is free, and a customer decides how much traffic comes in via iSCSI and how much comes in via NFS/CIFS or on HTTP or FTP."
NetApp and EMC, the leading vendors, also support Fibre Channel on multiprotocol arrays, or unified storage devices, that combine block- and file-based storage through a single controller that can manage both.
Such a multiprotocol array was a high priority for the Sacramento Superior Court, because it wanted a product that could handle its IBM FileNet documents, as well as the block-level needs of its servers running Microsoft Exchange and SQL Server. The court uses iSCSI for those mail and database servers and is in the process of upgrading to Fibre Channel, according to Lewis Walker, a senior IT analyst at the court.
"The converged systems do a good job of giving you all the options you need for departmental file storage, application-generating files, archiving, Oracle databases and the Exchange platform," says Andrew Reichman, an analyst at Forrester Research. Converged systems, he notes, are especially effective "if you don't have really extreme performance needs from any one of those workloads, and you just want something that's good enough and cost effective and available for everything."
Companies with 20 TB of data or less that want a basic, easy-to-manage system are well served by low-end converged systems from vendors such as Dell and Hewlett-Packard; medium-sized companies in the 20 TB to 50 TB range are candidates for higher end converged systems from NetApp and EMC, Reichman says. Large companies with hundreds of terabytes to 1 PB, however, need multiple boxes.
Vendors enter NAS space from all levels
NetApp and EMC entered the NAS space from different directions: EMC moving down from the high end and NetApp growing up to the enterprise. Now they both play at the entry level, midrange and high end. Both also sell NAS gateways with no storage, giving users the option to buy storage separately or use storage they already have in-house. But it's more common for customers to buy a complete array with the NAS head and storage tightly integrated.
Beyond EMC and NetApp, many of the low-end and midrange products from vendors such as Dell and HP run Microsoft's Windows Storage Server technology. IBM, which uses NetApp technology, is also a top competitor. BlueArc is one of the vendors specializing in high-end NAS.
Capacity, performance and features differentiate the entry-level products from the midrange and high-end NAS arrays. But products can be tricky to compare, since vendors classify their offerings so differently. EMC's Brad Bunce, director of product marketing for NAS platforms, views the company's low end at 100 TB of storage capacity or less, its midrange at 500 TB or less and its high end as anything more than 500 TB.
By contrast, some low-end arrays might offer no more than 15 TB, two to 24 disk drives, 1 GB or 2 GB of RAM and one or two 1 Gigabit Ethernet (GigE) ports. Midrange systems might peak at 100 TB, 100 drives or 200 drives, 4 GB of RAM and a half dozen 1 GigE or 10 GigE ports.
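To see how much those labels can diverge, consider a rough Python sketch that sorts the same capacity figure under both sets of numbers. The EMC cutoffs come from Bunce's description; the second set of bands is an assumption that treats the 15 TB and 100 TB figures cited for other arrays as hard ceilings, which the vendors themselves may not do. The function names are made up for the example.

```python
def emc_tier(capacity_tb):
    """Tier under the EMC bands described by Bunce: <=100 TB, <=500 TB, above 500 TB."""
    if capacity_tb <= 100:
        return "low end"
    if capacity_tb <= 500:
        return "midrange"
    return "high end"


def other_vendor_tier(capacity_tb):
    """Tier under assumed bands: 15 TB and 100 TB treated as hard ceilings (an assumption)."""
    if capacity_tb <= 15:
        return "low end"
    if capacity_tb <= 100:
        return "midrange"
    return "high end"


def compare(capacity_tb):
    # Return both labels side by side so the mismatch is easy to spot.
    return emc_tier(capacity_tb), other_vendor_tier(capacity_tb)
```

Under these assumptions, an 80 TB array would count as low end on EMC's scale but midrange on the other, so the tier name alone says little about what a buyer is getting.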
Some entry-level products don't offer the snapshot, replication and mirroring capabilities that are staples of the higher end systems. A midrange NAS array might permit a few hundred snapshots, whereas a high-end system could allow more than 10,000 snapshots. The numbers generally climb steeply at the high end, especially as clustered NAS enters the picture.
"Look beyond the speeds and feeds," advises Greg Schulz, founder of the StorageIO Group. "Don't get hung up on the number of ports, the number of drives, the number of controllers or processors, the amount of memory.
"Instead," Schulz suggests, "look at the effective performance. What is the useful capacity? What is the availability? What levels of redundancies? How many snapshots do I get for free? What features do I have to pay for? Is replication local or long distance? How big a file can I put into a file system? This is where you start to get into the details that line up with how you're going to use this."
This was first published in September 2008