Arrays score with both file and block storage

There are pros and cons to running multiple protocols through a single array controller.

Multiprotocol arrays that support block- and file-based storage through a single controller give users the best of both worlds: NAS for file-based information, and Fibre Channel (FC) or iSCSI-attached block-based storage for databases and other transactional applications. "NetApp [Network Appliance Inc.], EMC [Corp.] and Pillar Data Systems are seeing an increasing number of their arrays used in multiprotocol deployments," reports Roger Cox, research VP at Gartner Inc.'s offices in San Jose, CA.

Prior to SANs, storage resources were dedicated to a single server. It was only in the mid-1990s that FC SANs enabled the separation of server and storage, making a single storage pool accessible to more than one server. While a huge leap forward, FC SANs were--and mostly still are--complex and expensive, and they left a huge gap for a simpler and less-expensive technology.

NAS filled this void by offering storage services and by incorporating and controlling the file system within the storage system, presenting files and directories to hosts via the NFS and CIFS file-system protocols. A large percentage of installed storage stores files.

File-system protocols like CIFS and NFS treat each file as a single entity. For example, when a user opens a file, it's opened in its entirety and then locked to prevent others from modifying it. This is ill-suited for apps where data is accessed at a sub-file level, such as databases where small pieces of information are continuously read and updated within a larger file. Without question, block-based storage protocols like SCSI, iSCSI and FC are superior to the file-based access methods of NAS when it comes to non-file-based data access.
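To make the sub-file access pattern concrete, here is a minimal Python sketch. It assumes a hypothetical layout of fixed 64-byte records in a scratch file; the record size, function names and sample payload are illustrative and not taken from any database or array. It updates one record in place by seeking to its offset rather than rewriting the whole file, which is the kind of access databases depend on.

```python
import os
import tempfile

RECORD_SIZE = 64  # hypothetical fixed record size in bytes (an assumption)


def update_record(path, index, payload):
    """Overwrite one fixed-size record in place; the rest of the file is untouched."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload is larger than one record")
    record = payload.ljust(RECORD_SIZE, b"\0")  # pad to the full record size
    with open(path, "r+b") as f:  # read/write without truncating
        f.seek(index * RECORD_SIZE)
        f.write(record)


def read_record(path, index):
    """Read a single record by offset instead of loading the whole file."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE)
        return f.read(RECORD_SIZE).rstrip(b"\0")


def demo():
    """Create a 1,000-record scratch file, update record 500, read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"\0" * RECORD_SIZE * 1000)
        update_record(path, 500, b"balance=42")
        return read_record(path, 500), os.path.getsize(path)
    finally:
        os.remove(path)
```

Only 64 bytes are written for the update, and the file size stays the same. A whole-file protocol model would instead move and lock the entire file for the same change.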

You need to remember that information is stored in blocks--not as files--on the disk subsystem. Data is written in blocks to the disk volume, and block size is determined when the volume is formatted, typically ranging from 512 bytes to 2,048 bytes in size. A file, on the other hand, is a concatenation of data blocks managed by the file system. While the disk subsystem manages data in terms of blocks, the OS and humans manage information in terms of files and directories.
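The relationship between file size and block count can be sketched with a little arithmetic. The Python below is an illustration only: the function names are hypothetical, and it assumes a file occupies a whole number of blocks, ignoring file-system metadata.

```python
def blocks_needed(file_size_bytes, block_size_bytes):
    """Number of whole blocks required to hold a file (metadata ignored)."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    # Integer ceiling division avoids floating-point rounding.
    return -(-file_size_bytes // block_size_bytes)


def slack_bytes(file_size_bytes, block_size_bytes):
    """Unused bytes left in the last block allocated to the file."""
    allocated = blocks_needed(file_size_bytes, block_size_bytes) * block_size_bytes
    return allocated - file_size_bytes
```

For a 10,000-byte file, `blocks_needed(10_000, 512)` gives 20 blocks and `blocks_needed(10_000, 2_048)` gives 5 blocks. In both cases the last block is only partly used.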

Block-based storage access protocols (SCSI, iSCSI, FC) are indispensable because they're native to storage systems and can deal with all types of data efficiently. NAS lets you do away with standard file servers by incorporating file-system protocols into the storage system, providing simpler management, a higher degree of scalability, and added storage services like snapshots and replication. Here's how multiprotocol arrays differ from single-protocol arrays in terms of performance, price, ease of management and other features.

Multiprotocol arrays: Key considerations
Features. In many cases, array features are more important than performance. Application integration, snapshot, replication, backup options, thin provisioning and deduplication are at the top of most users' feature requirement lists.

Flexibility. Storage can be used or repurposed for all types of applications. The risk of a vendor product lock-in is lower than for block-based protocol arrays or NAS-only arrays.

Performance. Arrays with block and file-system protocol support are typically not the arrays of choice for very high-performance applications such as high-end databases and transactional applications.

Price. Arrays with NAS and SAN protocol support are significantly less expensive than two separate arrays; but depending on the vendor, additional protocols may not come free.

Storage management. Multiprotocol array management tools that ship with the array are simpler to use than third-party storage resource management apps, which vary widely regarding what arrays they support.

There are some performance issues with multiprotocol arrays when file- and block-based protocols compete for valuable disk I/Os. Large file access, copying a large number of files or NAS backup jobs may drag down SAN performance, especially in midsized and lower end multiprotocol arrays that are more likely to hit their performance limits.

The issue is aggravated in highly integrated arrays like those from NetApp where NAS and SAN share a single data path. In NetApp filers, file- and block-based protocols are passed through the NetApp Write Anywhere File Layout (WAFL) file-system layer, resulting in a higher degree of dependency between NAS and SAN. WAFL was initially conceived for NAS and, despite a commendable degree of optimization, the WAFL translation overhead results in more CPU cycles for processing block-based protocols than a comparable single-protocol FC array.

Conversely, the gateway/dual-path approach in EMC Celerra or Pillar Axiom arrays doesn't have a WAFL-equivalent layer through which all data, regardless of protocol, must pass on the way to the disk subsystem. But the risk of NAS impacting SAN performance still exists for all multiprotocol arrays, as both NAS and SAN will eventually access the back-end storage.

To alleviate performance interdependencies, multiprotocol array vendors added quality of service to their arrays that enables users to set priorities and define the number of requests permitted for NAS and SAN. "A user can define a logical volume in our Axiom arrays that allocates 85% of resources to databases and 15% to file access," says a Pillar Data Systems spokesperson.
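As a rough illustration of what such a priority split means, the Python sketch below divides a fixed I/O budget by percentage shares. It is not Pillar's or any vendor's actual QoS mechanism; the function name, the workload labels and the 20,000 IOPS figure are assumptions made for the example.

```python
def split_io_budget(total_iops, shares):
    """Divide an I/O budget among workloads by percentage share.

    `shares` maps a workload name to an integer percentage; the
    percentages must add up to exactly 100.
    """
    if total_iops < 0:
        raise ValueError("total_iops cannot be negative")
    if any(pct < 0 for pct in shares.values()):
        raise ValueError("shares cannot be negative")
    if sum(shares.values()) != 100:
        raise ValueError("shares must total 100 percent")
    # Integer arithmetic; any remainder from rounding down is left unallocated.
    return {name: total_iops * pct // 100 for name, pct in shares.items()}
```

With a hypothetical budget of 20,000 IOPS, `split_io_budget(20_000, {"database": 85, "file": 15})` reserves 17,000 IOPS for database traffic and 3,000 for file access.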

In practice, arrays that support both block- and file-based protocols aren't the arrays of choice for the performance required by high-end databases and transactional applications.

"If I am looking for the fastest block-based storage system, multiprotocol arrays like the ones from NetApp don't come to mind," says Greg Schulz, founder and senior analyst at StorageIO Group, a consulting firm in Stillwater, MN. "However, if I'm looking for the most feature-rich NFS/CIFS/block-based storage system, NetApp is at the top of the list."

Click here for a comparison chart of multiprotocol array choices (PDF).

Is there a difference between multiprotocol and single-protocol arrays with regard to redundancy and failure predictability? Simply put, multiprotocol arrays can't match the performance and reliability of high-end, single-protocol arrays. "There are a number of applications that require a level of availability only provided by high-end SAN arrays like HDS USP [Hitachi Data Systems' Universal Storage Platform] or EMC Symmetrix, mostly because of their ability to predict the performance impact in case a node fails," says Chris Bennett, NetApp's VP of core systems.

This level of high availability can be extended to file-based storage access by front-ending an EMC Symmetrix or HDS USP with a NAS gateway; for example, EMC's Celerra NAS gateway can be paired with a Symmetrix. Not having its own NAS gateway, HDS offers a NAS blade for its TagmaStore USP that enables access to USP storage via file-system protocols.

To increase the availability in arrays that support file- and block-based protocols through a single controller, nodes are often clustered; if a node fails, another node takes over its workload. Clusters are available in active-active and active-passive configurations. With an active-active cluster configuration, it's crucial to design the cluster to cope with the reduced processing power in case of a node failure. The cluster configuration may also impact the amount of usable storage, a fact painfully experienced by Chalkley Matlack, senior network administrator at doeLegal LLC in Wilmington, DE, who replaced his EMC Celerra NS500 array with a Reldata Inc. storage system. "We weren't able to move the NS500 beyond 16TB without switching from an active-active to an active-passive configuration, giving up redundancy," laments Matlack.
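The sizing concern for active-active clusters can be expressed as simple arithmetic. The Python sketch below is a simplification: it assumes identical nodes and that a failed node's workload is spread evenly across the survivors, which real arrays may not do. The function names are hypothetical.

```python
def max_safe_utilization(nodes, failures=1):
    """Highest per-node utilization (0.0 to 1.0) that still lets the
    surviving nodes absorb the load of `failures` failed nodes."""
    if nodes < 2 or not 0 < failures < nodes:
        raise ValueError("need at least 2 nodes and fewer failures than nodes")
    return (nodes - failures) / nodes


def utilization_after_failure(nodes, per_node_utilization, failures=1):
    """Per-node utilization once the failed nodes' load is spread evenly
    over the survivors. A result above 1.0 means the cluster is overloaded."""
    if nodes < 2 or not 0 < failures < nodes:
        raise ValueError("need at least 2 nodes and fewer failures than nodes")
    return per_node_utilization * nodes / (nodes - failures)
```

Under these assumptions, a two-node active-active pair has to run each node at 50% or less to survive a failure without overload. At 60% per node, the survivor would face 120% of its capacity.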

Cluster options are available at the higher end of the performance and price product spectrum from such companies as EMC, Microsoft Corp., NetApp and Pillar Data Systems. Multiprotocol array clusters are the way to go in environments with 24/7/365 uptime requirements. "Availability was our primary concern when we chose NetApp 3000 series clustered arrays for all of our locations," explains Michael Israel, senior VP of information services at Six Flags in New York City.

Storage management
Managing the storage of multiprotocol arrays is more challenging than managing FC-only arrays, mostly due to the limited support of NAS protocols by storage management apps and storage standards. "SMI-S has been targeting the SAN market and paid little attention to file-system protocols," says Sean Derrington, Symantec's director of storage management.

Storage resource management (SRM) vendors have overcome this hurdle by directly integrating their suites with arrays from leading storage vendors, with NetApp enjoying the most extensive support among arrays supporting both NAS and SAN. For instance, Symantec CommandCentral 5.0 has been integrated with NetApp filers down to the file level across multiple filers and multiple locations. "We're able to monitor performance and capacity across multiple filers and locations using a combination of SMI-S and Data Ontap APIs, a capability currently only available for Network Appliance filers," explains Derrington.

Similarly, vendors that offer both multiprotocol arrays and SRM apps, such as EMC ControlCenter and Hewlett-Packard (HP) Co. Storage Essentials, typically provide tighter integration and more advanced management options for their own multiprotocol arrays.

"For NetApp filers and the HP StorageWorks All-in-One Storage System, we offer a Storage Essentials agent that extends the full SRM capabilities to file-system protocols," says HP's Dean Schneider, marketing planning manager for Storage Essentials.

Array features
The features in some multiprotocol arrays are on par with, and in some cases ahead of, single-protocol arrays. Snapshot and replication have become standard features in arrays supporting NAS and SAN. While all vendors support replication, they differ in their implementation of it. Does the array support synchronous and asynchronous replication? Can replication be performed at a file and block level, or is it limited to one of the two protocols? NetApp and Pillar Data Systems, for example, support synchronous and asynchronous replication, and storage administrators have a choice of file- and volume-based replication.

Multiprotocol arrays provide a wider range of backup options than block-based arrays. Besides snapshot and replication, users can directly back up data through CIFS and NFS file-system protocols without going through another server. Moreover, the majority of multiprotocol arrays--Windows Storage Server excluded--support the Network Data Management Protocol (NDMP), which is optimized for backup and eliminates the need for installing a backup agent on the storage system. Because Windows Storage Server is Windows based, its lack of NDMP support isn't much of an issue, as most Windows Server 2003-compatible backup applications will also run on Windows Storage Server.

Thin provisioning is becoming increasingly important and is available in multiprotocol arrays from EMC, HDS and NetApp. Deduplication is another feature making inroads. While a few vendors have some level of deduplication, NetApp is currently the only multiprotocol array vendor offering a deduplication option for all of its filers.

If both file- and block-based protocols are required, procuring a multiprotocol array or NAS gateway is significantly less expensive than buying two arrays. "As a rule of thumb, you can add about 10% of the array price for additional protocols," says NetApp's Bennett.

In general, arrays from leading storage vendors like NetApp and EMC include only a limited number of features in the base price, and extra features and protocols have to be bought separately. Some customers are repelled by having to pay for features they expected to be part of the base price and are driven to array vendors with a more inclusive pricing model.

Product assessments
Here's how the multiprotocol arrays from the various vendors stack up:

EMC. As the leading FC array vendor, EMC has the benefit of selling its Celerra multiprotocol NAS and gateways into its large customer base. With a feature set similar to that of NetApp, as well as the ability to complement its block-based arrays, EMC is a strong player in the combined NAS/SAN array market. Unlike NetApp, however, EMC doesn't enjoy the benefits of a unified array architecture; as a result, administration and management of the different array families vary.

"With Symmetrix, Clariion, Celerra and Centera, EMC has four different solutions, each with its own code base and architecture, and it would make sense for EMC to head to a unified solution," says Brian Garrett, technical director, ESG Lab at Enterprise Strategy Group (ESG), Milford, MA.

HDS. The FC array behemoth has put little in-house development into file-system protocols. With the exception of a NAS blade for its TagmaStore USP, HDS relies on its relationship with BlueArc Corp. for file-system protocol support, offering the Hitachi high-performance NAS platform (BlueArc NAS gateway) to customers who need file-system protocol support beyond the USP NAS blade.

HP. As a traditional block-based array vendor, HP has entered the NAS market with two offerings. At the high end it offers HP Scalable NAS, which is mostly NAS, although it supports iSCSI if used with an iSCSI-enabled array. For the SMB market, HP offers the HP StorageWorks AiO Storage System, which is available in different flavors depending on how much storage is required. Although AiO runs Microsoft Windows Storage Server, the addition of tools like Application Storage Manager and HP's superb support organization have made AiO one of the simplest multiprotocol arrays on the market.

IBM. The company, even more than HDS, has decided not to develop file-system protocol technologies in-house. Instead, IBM OEMs and sells NetApp filers and gateways to customers in need of NAS protocols.

Microsoft. The company entered the multiprotocol market by acquiring iSCSI target software from WinTarget in 2006, providing an iSCSI option for its Windows Storage Server NAS. With about 50 Windows Storage Server OEMs, including powerhouses like Dell Inc. and HP, Windows Storage Server has evolved into a formidable player in the multiprotocol array market. Although Microsoft has traditionally dominated the SMB market with Windows Unified Data Storage Server (WUDSS) 2003 and its out-of-the-box iSCSI support and clustering, it's now also competing in the enterprise space.

NetApp. The company was the first array vendor to offer block- and file-based protocol support in its products, and it clearly leads the pack. Most importantly, among the leading array vendors, NetApp is the only one with a unified array architecture. From its low-end FAS200 to the high-end FAS6000, NetApp's arrays are all based on a single architecture and run the same software. A large number of features, combined with ease of use, make NetApp well positioned in the multiprotocol array market. "NetApp has always been easier and EMC has always been more complex," says Garrett.

Pillar Data Systems. A Larry Ellison-financed startup, Pillar has supported FC, iSCSI, NFS and CIFS in its Axiom arrays from day one. A scalable architecture, offloading of file-system protocol and RAID processing to so-called "Slammers," and cluster support make the Axiom array family a great fit for SMBs and enterprises. Tight integration with Oracle Corp. tools like Oracle Enterprise Manager makes Axiom arrays a perfect fit in Oracle environments. "By having separate data paths for iSCSI, FC and file-system protocols, Pillar has one of the best-performing multiprotocol arrays in the market," says Gartner's Cox.

The demand for arrays that support both NAS and SAN protocols is on the rise. The sweet spot for a multiprotocol array is an environment that needs the flexibility of file- and block-based protocols, and where features and ease of use are more important than very high performance. "Virtualization will drive multinetworked and multiprotocol solutions," says Garrett. "And [the vendor that] can achieve it in a simple, integrated, scalable and affordable fashion will be the big winner."
