Network attached storage (NAS) supports file-based applications, allowing organizations of all sizes to consolidate distributed file servers into a small number of dedicated file storage systems operating under a common file system like NFS or CIFS. In the past, NAS brought value to enterprise storage, but there were limitations in throughput, connectivity, reliability and scalability. As a result, mission-critical storage tasks still required a storage area network (SAN). Today, a new generation of high-performance NAS systems promises to overcome past limitations, bringing SAN-type features and capabilities to file-based storage.
What is high-performance NAS? How does it differ from regular NAS?
There is no single definition of high-performance NAS. Experts still disagree about exactly what it is, but it can be defined by comparison to conventional NAS.
On the interface side, high-performance NAS will typically offer far more Ethernet ports than traditional NAS systems. For example, a common NAS box may include just one or two Ethernet ports, while high-performance NAS boxes can support 10, 16 or more Gigabit Ethernet (GigE) ports. Better connectivity is essential to handle more storage requests. The added connectivity can also improve NAS reliability through port aggregation and failover.
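As a rough illustration of why port count matters, the sketch below computes the theoretical peak bandwidth of a group of aggregated GigE ports, and what remains if one port fails. It assumes the nominal 1,000 Mbit/s line rate per port and ignores protocol overhead; the function name and the port counts are illustrative only and do not describe any particular product.

```python
# Theoretical peak bandwidth of aggregated Gigabit Ethernet ports.
# Assumption: every port runs at its nominal line rate with no protocol overhead.

GIGE_MBIT_PER_S = 1000  # nominal line rate of one GigE port


def aggregate_mbytes_per_s(ports: int, failed_ports: int = 0) -> float:
    """Return the theoretical peak bandwidth in MB/s across the surviving ports."""
    if not 0 <= failed_ports <= ports:
        raise ValueError("failed_ports must be between 0 and ports")
    active_ports = ports - failed_ports
    return active_ports * GIGE_MBIT_PER_S / 8  # 8 bits per byte


# Compare a two-port box with 10- and 16-port systems, with and without one failed port.
for port_count in (2, 10, 16):
    healthy = aggregate_mbytes_per_s(port_count)
    degraded = aggregate_mbytes_per_s(port_count, failed_ports=1)
    print(f"{port_count} ports: {healthy} MB/s healthy, {degraded} MB/s with one port down")
```

In this simplified model, losing one port on a two-port box halves the available bandwidth, while a 16-port system keeps most of its capacity. Real-world figures will be lower and depend on the aggregation method and traffic mix.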
Although high-performance NAS systems use the same SATA or SAS disks, the disk controller engines within the system are highly optimized and allow greater storage scalability and throughput. For example, a high-performance NAS system may include multiple heads that are able to communicate with multiple disks simultaneously. Optimization also extends to specific I/O operations or data types. For example, a high-performance NAS platform may be optimized to handle a large number of IOPS, large sequential data streams or focus on NFS operations per second. "You've got high-performance NAS systems that are doing lots of concurrent file access or lots of metadata lookups -- it's not just throughput," says Greg Schulz, founder and senior analyst at the Storage I/O Group. "That's also where a lot of the high-throughput systems fall apart … they're not able to handle lots of small files or large numbers of metadata requests."
Clustering allows multiple NAS boxes to be connected to the network and to each other. This increases capacity and throughput, while presenting a single pool of NAS storage. The use of clusters also builds in resiliency, so if one box in the cluster fails, it won't significantly impair the remaining elements of the cluster. "If you need more throughput, you go buy another node and place it on there," says Randall White, senior consultant at GlassHouse Technologies Inc. However, some experts warn that NAS clustering does not automatically guarantee high performance. "BlueArc [Titan] is able to deliver performance that can rival some small clusters, but with a single node," Schulz says, noting that BlueArc's Titan products can also be clustered for even higher levels of throughput and NFS operations per second.
Another key attribute of high-performance NAS is the use of a global file system (GFS). A GFS is particularly prominent in clustered NAS systems because all nodes in the cluster can concurrently share the same pool of storage, allowing nodes to act independently or together. A GFS is frequently integrated into the NAS device's operating system, but in some cases, the GFS is applied as another layer of software added on top of the NAS architecture.
Does high-performance NAS present any deployment challenges?
High-performance NAS allows an organization to perform more work in less time, or reduce the number of file servers needed to do the same work. In some cases, organizations can simplify their NAS storage infrastructure and save energy at the same time. "I fit into my existing power profile; maybe I can offload some energy costs, or I ensure that I have enough power moving forward to support growth," Schulz says, citing the importance of green storage.
It's important for storage administrators to evaluate the underlying management requirements of a prospective high-performance NAS system. Most NAS users expect management efficiency because there are fewer file servers or conventional NAS systems to manage. However, experts note that the management implications for high-performance NAS systems can vary wildly. Some systems can offer high performance and throughput, but lack many enterprise features. Other systems may provide a wealth of capabilities, but demand considerable management effort. In addition, some high-performance NAS systems may demand special host software or drivers that can achieve even better performance or throughput at the expense of added software maintenance.
Performance can also be influenced by the applications using a high-performance NAS system. For example, some systems may be optimized for certain applications, but require tuning and load balancing for other applications. Find the system that is most appropriate for your specific data workload.
Some high-performance NAS platforms may use an open interface, but rely on proprietary hardware. Common examples include systems from Isilon Systems Inc. and Panasas Inc. that support NFS and CIFS, but oblige you to buy the hardware nodes and storage from that particular vendor. This may be a problem for organizations sensitive to vendor lock-in. Experts note that other high-performance vendors can interoperate with many different storage vendors, easing compatibility concerns.
What are the biggest mistakes or oversights in high-performance NAS?
Incorrect assumptions are often the biggest impediment to high-performance NAS systems. For example, one typical error might be selecting a clustered product assuming that it will handle large sequential files, or attempting to store transactional data without sufficient storage IOPS. You will not obtain the same performance results from every type of data, so it's important to understand the applications and data workloads first, and then select a platform that can be optimized for those workloads.
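The difference between an IOPS-bound workload and a throughput-bound one can be seen with simple arithmetic: throughput is roughly IOPS multiplied by the I/O size. The sketch below uses that relationship; the function names and the workload figures are hypothetical and chosen only to show the contrast.

```python
# Relating IOPS, I/O size and throughput: throughput = IOPS x I/O size.
# Assumption: a steady workload with a single, uniform I/O size.

KB_PER_MB = 1024


def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput in MB/s delivered by a given IOPS rate at a given I/O size."""
    return iops * io_size_kb / KB_PER_MB


def iops_needed(target_mb_per_s: float, io_size_kb: float) -> float:
    """IOPS required to sustain a target throughput at a given I/O size."""
    if io_size_kb <= 0:
        raise ValueError("io_size_kb must be positive")
    return target_mb_per_s * KB_PER_MB / io_size_kb


# Hypothetical transactional workload: many small 4 KB operations.
print(throughput_mb_per_s(iops=20_000, io_size_kb=4))    # 78.125 MB/s
# Hypothetical streaming workload: few large 1 MB operations.
print(throughput_mb_per_s(iops=400, io_size_kb=1024))    # 400.0 MB/s
# Moving the same 400 MB/s in 4 KB operations would take far more IOPS.
print(iops_needed(target_mb_per_s=400, io_size_kb=4))    # 102400.0
```

A system sized only by its MB/s rating can look ample for the streaming case yet fall well short on the small-file case, which is why the workload should be characterized before the platform is chosen.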
High-performance NAS systems can easily monopolize the available network bandwidth. "It will devastate your corporate network if you just slap this thing on and give it an IP address and go," White says, adding that new switching, network architecture changes and additional LAN bandwidth are often needed to accommodate high-performance systems. TCP offload engine (TOE) cards can be used in each host server to ease processing loads and keep traffic spikes to a minimum.
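One quick way to gauge the risk White describes is to compare the traffic a NAS system could generate at full line rate with the capacity of the link that carries it into the rest of the LAN. The sketch below computes that ratio; the uplink speeds and port counts are hypothetical, and the function name is illustrative.

```python
# Oversubscription check: potential NAS traffic versus LAN uplink capacity.
# Assumption: all NAS ports can transmit at full line rate at the same time.


def oversubscription_ratio(nas_ports: int,
                           uplink_gbit_per_s: float,
                           port_gbit_per_s: float = 1.0) -> float:
    """Ratio of peak NAS traffic to uplink capacity; above 1.0 the uplink can saturate."""
    if uplink_gbit_per_s <= 0:
        raise ValueError("uplink_gbit_per_s must be positive")
    return nas_ports * port_gbit_per_s / uplink_gbit_per_s


# A hypothetical 16-port GigE NAS behind a 2 Gbit/s uplink and behind a 10 Gbit/s uplink.
for uplink in (2.0, 10.0):
    ratio = oversubscription_ratio(nas_ports=16, uplink_gbit_per_s=uplink)
    status = "can saturate the uplink" if ratio > 1.0 else "fits within the uplink"
    print(f"{uplink} Gbit/s uplink: ratio {ratio:.1f}, {status}")
```

A ratio well above 1.0 is a sign that switching or LAN bandwidth should be reviewed before the system goes into production.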
Who is using high-performance NAS?
NAS performance issues became particularly apparent for Tippett Studio, a visual effects and computer animation studio in Berkeley, Calif. The studio had employed a cluster of SGI Origin 9500 series NAS platforms for main storage, with no SAN. But the traditional NAS cluster simply could not keep pace with the I/O from a busy rendering farm of over 1,000 CPUs. The engineering team at Tippett Studio had to select a new NAS system that could host 50 TB of current NAS storage, while providing the necessary level of performance, expandability and management ease.
The engineering team ultimately settled on a Titan 2200 from BlueArc Corp., but only after careful testing. "We measured the performance of the same jobs from our render farm against the previous NAS and then the replacement NAS," says Daniel R. Basse, systems manager at Tippett Studio, noting that functional aspects of the system were also evaluated, including failover and other reliability features. "We had something on the order of a six-fold improvement in performance," he says. Manageability has also improved. The legacy NAS required several servers and applications dedicated to gathering storage metrics. The Titan 2200 provided those same tools out of the box, eliminating chatty tools and easing network traffic overhead. Moving to a single box has also reduced complexity, which has resulted in less staff management time and lower downtime.
However, the move to high-performance NAS did cause an unforeseen wrinkle. It is actually too efficient for existing back-end applications. Renders are now being completed too quickly, which is overwhelming the studio's home-grown batch queuing system. "We actually have to replace our batch queuing system with something that can keep up with the render farm," Basse says. "It's had the net effect of evolving our network and moving our business forward, which is good."
Basse sees the studio's storage needs doubling to 100 TB within the next two years, but fully expects high-performance NAS to continue meeting performance and reliability needs. "The next job we're looking at doing is 3D stereo," he says. "And that will again double the size of our storage requirements." Using NAS allows storage to expand easily and economically as projects expand and become ever more complicated.
What is the future of high-performance NAS?
Experts agree that high-performance NAS is moving into the mainstream, but many of the features found in traditional NAS are not yet available in high-performance NAS systems. In the future, high-performance NAS vendors should increasingly offer many popular NAS features, such as snapshots, replication, point-in-time (PIT) copies, finer management granularity, better load balancing and data migration.
High-performance NAS has also renewed interest in storage virtualization. Just as VMware allows servers to be broken up into virtual machines and used throughout the enterprise, storage virtualization aggregates and allocates storage without regard for its physical location. This is an important attribute for high-capacity storage systems, especially systems that can access storage located outside of the box.