I think there is a need for better price/performance data in the world of NAS. We have seen NAS grow from a product pushed by a little company called NetApp into a multibillion-dollar business. Of course, the two giants that own the lion's share of the market are NetApp and EMC, but over the past five years the market has attracted a slew of newcomers, including BlueArc, Exanet, Isilon, OnStor and PolyServe, among others.
Of course, HP has a NAS line built around their StorageWorks product line and IBM finally recognized that not having a NAS product line was sinful in this market and made a deal with NetApp. Sun, having created NFS and at least the concept of network file serving, wasted a decade fighting NAS and finally yielded and bought Procom. Microsoft entered this market aggressively a few years ago with their Windows Storage Server 2003 software and enabled a pretty darn healthy low-end NAS market via Dell, HP and other server vendors. The point is that the NAS market is real and IT is betting millions of dollars on NAS products.
There is a lot to be said for the seasoned vendors that have a wealth of software tools to make using and managing a NAS box easier. I would be hesitant to buy a NAS box today that didn't support snapshots, mirroring or replication and a slew of other applications that we all take for granted now. But, once you have established that the vendor meets the baseline criteria, how does one make the decision that maximizes value? That is where price/performance data comes into play. But first, let's take a quick look at what benchmarks are commonly used to measure performance of a NAS system.
There are two that you should be aware of. In the NFS world, the one that matters is SPECsfs, a benchmark developed and managed by the SPEC organization. (If we had a few beers in front of us and a lot of time, I would even tell you the part yours truly played in the creation of that organization. But, I digress.) SPECsfs basically measures the number of NFS operations that a NAS box can deliver at a certain average response time, which is measured in milliseconds (msec). Vendors can show the results up to an average response time of 50 msec. The shape of the curve tells a lot about how the system would behave as more clients were added and when response time became unacceptable. The test can be conducted using TCP or UDP (UDP, being a lighter protocol, yields bigger numbers). You can find the results of most popular NAS products at www.spec.org.
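To make the curve reading concrete, here is a minimal sketch in Python. It assumes a published result can be reduced to a list of (delivered NFS ops/sec, average response time in msec) points; the function name and the sample numbers are made up for illustration and do not come from any real SPEC submission. It simply picks the highest throughput point that stays within the 50 msec ceiling mentioned above.

```python
def peak_throughput(curve, max_response_ms=50.0):
    """Return the highest ops/sec among points at or under the response-time ceiling.

    curve: iterable of (ops_per_sec, avg_response_ms) pairs (hypothetical format).
    """
    eligible = [ops for ops, response_ms in curve if response_ms <= max_response_ms]
    if not eligible:
        raise ValueError("no data point meets the response-time ceiling")
    return max(eligible)


# Made-up sample curve: response time climbs sharply as the box saturates.
curve = [
    (2000, 1.5),
    (4000, 2.1),
    (6000, 3.4),
    (8000, 6.8),
    (9000, 14.2),
    (9500, 55.0),  # past the 50 msec ceiling, so it does not count
]

print(peak_throughput(curve))        # headline number under the 50 msec ceiling
print(peak_throughput(curve, 5.0))   # a stricter ceiling tells a different story
```

Running the same curve against a tighter ceiling is a quick way to see how much of the headline number depends on tolerating slow responses.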
In the world of CIFS, the benchmark most often used is NetBench, originally created by Ziff Davis and later managed by Veritest. Basically, NetBench shows the throughput in MBps across the Y-axis and the number of test systems on the X-axis. The NAS box is pounded by Windows clients (modeled by test systems) with a pre-defined load and the throughput is measured. The number of test systems is increased until the NAS box croaks. Microsoft uses this benchmark extensively, for what I believe are obvious reasons.
There is a third benchmark, called Iometer, that you will run into. It is really a disk performance test and is not relevant for NAS, although it is sometimes thought to be. Ignore that one.
I am going to use the SPEC benchmark to make my point: Vendors go to great lengths to show the highest number possible, while still staying below the 50 msec response time. Since the benchmark does not specify the number of disk drives or the number of file systems or the type of RAID used, vendors use all kinds of tricks to maximize this number. A whole lot of millions is riding on that one number, so they fine tune the darn thing to the hilt, as I would if I were in their shoes. I was for many years.
Spindle count is an important aspect of this benchmark, so don't be surprised that vendors use lots of spindles, even if no one would ever buy that configuration. But that is exactly what I am leading up to: I want the vendors to not only show the configuration they use (which, by the way, they have to, as SPEC requires it), but to price it using their list prices. That way, if they have to use an absurd configuration to achieve the results, then let the price show that. Let the buyer then decide if they want to buy the more expensive unit on the basis of vendor reputation or the sophistication of their software tools, etc. I am totally fine with that.
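The arithmetic I have in mind is simple enough to sketch in a few lines of Python. Everything here is an assumption for illustration: the vendor names, throughput figures, list prices and spindle counts are invented, and dollars per delivered op/sec is just one reasonable way to express price/performance.

```python
from dataclasses import dataclass


@dataclass
class PricedResult:
    vendor: str            # hypothetical vendor label
    ops_per_sec: float     # peak benchmark throughput for the tested configuration
    list_price_usd: float  # list price of that exact tested configuration
    spindles: int          # number of disk drives used in the test

    @property
    def dollars_per_op(self) -> float:
        """List price divided by delivered ops/sec (lower is better)."""
        return self.list_price_usd / self.ops_per_sec


# Invented figures: Vendor A posts the bigger number but needs far more hardware.
results = [
    PricedResult("Vendor A", ops_per_sec=40000, list_price_usd=800000, spindles=224),
    PricedResult("Vendor B", ops_per_sec=25000, list_price_usd=300000, spindles=56),
]

# Rank by value rather than by raw throughput.
ranked = sorted(results, key=lambda r: r.dollars_per_op)
for r in ranked:
    print(f"{r.vendor}: ${r.dollars_per_op:.2f} per op/sec using {r.spindles} spindles")
```

In this made-up case the vendor with the larger headline number comes out second on value, which is exactly the kind of thing a priced configuration would expose.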
I also think that, as we get closer to the entire industry supporting a global namespace (all the new clustered file system and distributed file system-based products provide a single file system image), we need an apples-to-apples performance benchmark for this from all vendors. But I will tackle that issue another time. For now, let's focus on the SPEC and NetBench results and price those configurations. If you agree there is merit in this approach, then start asking your favorite vendor to provide the priced configuration. Then, even if you stay with the existing vendor, you will do so with your eyes open to the value you are (or are not) getting.
I would be keen on your feedback on this. I would also entertain feedback from the NAS vendors, both new and established. You can reach me at firstname.lastname@example.org.
About the author: Arun Taneja is the founder and consulting analyst for the Taneja Group. Taneja writes columns and answers questions about data management and related topics.