It began with a press release from QLogic and Microsoft lauding benchmark results showing “near-native throughput” on Hyper-V hosts attached to storage via Fibre Channel, to the tune of 200,000 IOPS.
Server virtualization analyst Chris Wolf of the Burton Group took the testing methodology to task on his blog:
The press release was careful to state the hypervisor and fibre channel HBA (QLogic 2500 Series 8Gb adapter), but failed to mention the back end storage configuration. I consider this to be an important omission. After some digging around, I was able to find the benchmark results here. If I was watching an Olympic event, this would be the moment where after thinking I witnessed an incredible athletic event, I learned that the athlete tested positive for steroids. Microsoft and QLogic didn’t take a fibre channel disk array and inject it with Stanzanol or rub it with “the clear,” but they did use solid state storage. The storage array used was a Texas Memory RamSan 325 FC storage array. The benchmark that resulted in nearly 200,000 IOPS, as you’ll see from the diagram, ran within 90% of native performance (180,000 IOPS). However, this benchmark used a completely unrealistic block size of 512 bytes (a block size of 8K or 16K would have been more realistic). The benchmark that resulted in close to native throughput (3% performance delta) yielded performance of 120,426 IOPS with an 8KB block size. No other virtualization vendors have published benchmarks using solid state storage, so the QLogic/Hyper-V benchmark, to me, really hasn’t proven anything.
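To put the block-size complaint in perspective, here is a rough back-of-the-envelope sketch that converts the IOPS figures quoted above into data throughput. It assumes throughput is simply IOPS multiplied by block size, takes "8KB" to mean 8,192 bytes, and uses a hypothetical helper named `throughput_mb_per_s`; none of this comes from the benchmark report itself.

```python
def throughput_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Convert an IOPS figure to data throughput in decimal megabytes per second."""
    return iops * block_size_bytes / 1_000_000


# IOPS figures as quoted in Wolf's post; block sizes are assumptions
# based on his description (512 bytes, and 8 KB taken as 8192 bytes).
results = {
    "512 B blocks, 180,000 IOPS": throughput_mb_per_s(180_000, 512),
    "8 KB blocks, 120,426 IOPS": throughput_mb_per_s(120_426, 8 * 1024),
}

for label, mb_per_s in results.items():
    print(f"{label}: {mb_per_s:.1f} MB/s")
```

Under those assumptions, the headline 512-byte run moves roughly a tenth of the data per second that the 8KB run does, even though its IOPS number is higher.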
I talked with QLogic’s VP of corporate marketing, Frank Berry, about this yesterday. He said Wolf had misinterpreted the intention of the testing, which was meant only to show the performance of Hyper-V vs. a native Windows server deployment. “Storage performance wasn’t at issue,” he said. At least one of Wolf’s commenters pointed this out, too:
You want to demonstrate the speed of your devices, the (sic) you avoid any other bottlenecks: so you use RamSan. You want to show transitions to and from your VM do not matter, then you use a blocksize that uses a lot of transitions: 512 bytes…
But Wolf points to wording in the QLogic press release, which claims the result “surpasses the existing benchmark results in the market,” and argues that this wording “implies that the Hyper-V/QLogic benchmark has outperformed a comparable VMware benchmark.” Furthermore, he adds:
Too many vendors provide benchmark results that involve running a single VM on a single physical host (I’m assuming that’s the case with the Microsoft/QLogic benchmark). I don’t think you’ll find a VMware benchmark published in the last couple of months that does not include scalability results. If you want to prove the performance of the hypervisor, you have to do so under a real workload. Benchmarking the performance of 1 VM on 1 host does not accurately reflect the scheduling work that the hypervisor needs to do, so to me this is not a true reflection of VM/hypervisor performance. Show me scalability up to 8 VMs and I’m a believer, since consolidation ratios of 8:1 to 12:1 have been pretty typical. When I see benchmarks that are completely absent of any type of real world workload, I’m going to bring attention to them.
And really, why didn’t QLogic mention the full configuration when it first reported the test results? For that matter, why not use regular disk during the testing, since that’s what most customers are going to be using?
On the other hand, QLogic and Microsoft would be far from the first to put their best testing results forward. But does anyone really base decisions around vendor-provided performance benchmarks anyway?