Like other areas of storage-area network (SAN) management, storage performance monitoring has become a tougher job as virtual servers proliferate.
However, the latest storage performance monitoring products have been adapted to virtual environments, and some are even specifically built for hypervisor technologies.
Performance monitoring in virtual environments is moving away from siloed tools that individually gauge storage, networking and physical host performance. Instead, VMware Inc. and third-party vendors such as BlueStripe Software Inc. are developing an application-focused approach that aggregates performance metrics and presents an overall picture of system health to virtualization, storage and networking administrators. When issues arise, the tools can drill down for detailed information and find the source of the problem.
Storage performance monitoring tools in virtual environments
In a physical server environment, storage performance monitoring tools watch the operating system (OS) to measure the server’s performance statistics. Typically, there are only two ports for data flowing in and out of the hardware. It’s easier to identify the physical links to the storage, including direct LUN connections to applications.
A virtual environment introduces OS emulators and multiple virtual ports within each physical host, which will render performance monitoring tools built for physical environments unreliable.
"Virtualization has been a wonderful boon to the cost of platforms and the flexibility of where you host applications,” said Vic Nyman, BlueStripe’s founder and chief operating officer. “But it has also become an increasing challenge in how you assess what a given business application is using in terms of storage, how you measure its availability and how you gauge its performance."
Typical virtual environment storage issues include storage mapping and faulty configurations. Storage mapping problems arise when there's a storage-related performance issue, but you don’t know which physical host the application’s storage resides on. Misconfiguration issues arise when administrators make errors provisioning VM storage, or when the assumptions made at the time of the original provisioning no longer apply.
"Things are happening so fast,” said Bob Laliberte, a senior analyst at Milford, Mass.-based Enterprise Strategy Group (ESG). “You used to have these heavily siloed domains, and now all of a sudden everything is collapsing."
SCSI reservation troubles also plague data storage administrators. vSphere and Microsoft Cluster Service use SCSI reservations to ensure that VMs writing metadata changes have exclusive access to shared storage LUNs. If multiple hosts concurrently access and update shared resource metadata files, data could easily become corrupted. But if too many reservations are allowed, hosts attempting to access locked storage LUNs will get I/O failures and, after multiple attempts, the operation may ultimately fail.
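The trade-off described above can be illustrated with a toy model. This is only a sketch: the `SharedLun` class, the `update_metadata` helper and the retry count are hypothetical names invented for illustration, and they do not reflect how vSphere or Microsoft Cluster Service actually issue SCSI reservations. The idea is simply that a host must hold an exclusive reservation before changing shared metadata, and a host that keeps finding the LUN reserved eventually gives up.

```python
class SharedLun:
    """Toy model of a shared LUN that one host at a time may reserve."""

    def __init__(self):
        self.reserved_by = None      # name of the host holding the reservation
        self.metadata_version = 0    # stands in for shared metadata

    def reserve(self, host):
        """Grant the reservation only if no host currently holds it."""
        if self.reserved_by is None:
            self.reserved_by = host
            return True
        return False                 # reservation conflict

    def release(self, host):
        """Release the reservation, but only for the host that holds it."""
        if self.reserved_by == host:
            self.reserved_by = None


def update_metadata(lun, host, max_retries=3):
    """Reserve, update, release. Returns False if every attempt conflicts."""
    for _ in range(max_retries):
        if lun.reserve(host):
            try:
                lun.metadata_version += 1
                return True
            finally:
                lun.release(host)
    return False


lun = SharedLun()
lun.reserve("host-a")                       # host-a holds the LUN
blocked = update_metadata(lun, "host-b")    # every retry hits a conflict
lun.release("host-a")
succeeded = update_metadata(lun, "host-b")  # now the update goes through
print(blocked, succeeded, lun.metadata_version)
```

In this model the reservation protects the metadata from concurrent updates, but a host that holds it for too long causes other hosts' operations to fail, which is the contention problem administrators see in practice.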
According to Paul Turner, general manager at NetApp Inc.'s SANscreen business unit, tools built for physical environments need to become virtualization-aware. “In a virtual environment, a lot of the existing tools [can] actually work really well,” he said. “They just need to become virtualization-aware. They need to know what the VM mapping is through to their storage.”
Virtualization-aware performance monitoring tools
While the need for virtualization-aware performance monitoring tools is becoming well known, there still aren’t that many of them on the market. According to Jeff Boles, senior analyst and director of validation services at Hopkinton, Mass.-based Taneja Group Inc., "performance monitoring in virtual environments is still massively under-served."
A discussion of storage performance monitoring tools in virtual environments begins with those built into VMware vSphere. The vSphere Client monitors storage performance for whole data centers, clusters, physical hosts or individual virtual machines. The Performance tab within the vSphere Client dashboard shows both an Overview and Advanced view. The Overview option shows key statistics, while the Advanced option provides more detailed information. By itself, the vSphere Client has only limited historical statistics for trending and planning purposes.
VMware vCenter Server, formerly called VMware VirtualCenter, provides much more detailed information, as well as better alerting. vCenter AppSpeed, which VMware obtained when it acquired B-hive Networks Inc. in May 2008, tracks transaction performance and measures latency and throughput in the virtual environment as a tab in vCenter. It also can give you application service-level agreement (SLA) status and troubleshoot application performance directly inside vCenter.
Vendor approaches to storage performance monitoring in the virtual world
Several vendors are tackling storage performance monitoring in the virtual world, including BlueStripe Software, NetApp, VMware and Virtual Instruments.
BlueStripe's FactFinder v5: BlueStripe Software has a different take on storage performance monitoring. The company’s flagship product, FactFinder v5, was released in March as an application-focused management software package. “We have a whole new approach to managing application systems,” BlueStripe’s Nyman said. “On the fly, we will automatically discover the applications and the transaction paths. We'll also go through the stack and tell you where that transaction is stuck. It's like a bridge between the customer transaction and the technology and systems that support it.”
The BlueStripe agents and passive observer peer into every aspect of an application’s performance, including hypervisors, networking and heterogeneous storage systems. “We see the application, its dependency on the storage, and its performance and interaction,” Nyman said. But the product doesn’t necessarily drill down to find the root cause of a problem. “We see when the application has issues with storage,” he explained. “We see when it has performance bottlenecks. We don't necessarily know why it may not be performing well on that storage.”
BlueStripe is not setting out to replace traditional storage performance monitoring tools. Just like VMware vCenter Operations, it aggregates data and presents an overall system-health view. When performance issues arise, administrators can drill down into dependent systems and get the right system manager to look at the performance monitoring and diagnostic tools for more specific analysis.
BlueStripe’s FactFinder and vCenter Operations are two products that show the market’s increasing interest in overall system monitoring tools and its decreasing focus on separate system monitoring silos. But a proven, robust storage performance monitoring tool such as Virtual Instruments' VirtualWisdom remains important in today’s increasingly complex virtual environments. Either way, keeping tabs on storage performance is essential.
NetApp's Akorri BalancePoint: NetApp acquired Akorri Networks Inc. in February, and moved Akorri BalancePoint storage performance monitoring technology into the OnCommand management software suite with its existing SANscreen Service Insight storage management solution. SANscreen performs capacity planning and trending in VM environments. It can see all the capacity being used by all the virtual machines, how much each VM is using and what each cluster is using. It can also provide capacity planning reports and analytics.
“The reason we acquired Akorri was because we were seeing quite a significant need for better tools in virtual environments,” NetApp’s Turner said. “In particular, what we saw was the need for really good performance modeling tools and performance prediction tools.”
Turner said the Akorri Performance Index watches CPU and memory headroom and uses queuing theory, a mathematical model based on simulated transactions and queues, to predict when a server will exceed a set threshold. Alerts are sent to the vCenter console or by email.
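To give a feel for how queuing theory can turn utilization into a headroom estimate, here is a minimal sketch using the textbook M/M/1 single-queue model, in which mean response time is R = S / (1 - ρ) for service time S and utilization ρ = λS. This is an assumption chosen for illustration, not a description of Akorri's actual model; the function names and the sample numbers are hypothetical.

```python
def mm1_response_time(service_time_s, arrival_rate_per_s):
    """Mean response time of an M/M/1 queue: R = S / (1 - utilization)."""
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        return float("inf")          # the queue grows without bound
    return service_time_s / (1.0 - utilization)


def max_arrival_rate(service_time_s, response_limit_s):
    """Largest arrival rate that keeps mean response time within the limit.

    Solving S / (1 - rate * S) <= limit for rate gives 1/S - 1/limit.
    """
    if response_limit_s <= service_time_s:
        return 0.0                   # the limit cannot be met even when idle
    return 1.0 / service_time_s - 1.0 / response_limit_s


# Hypothetical workload: 5 ms per I/O, 100 I/Os per second, 20 ms limit.
service_time = 0.005
current_rate = 100.0
limit = 0.020

current_response = mm1_response_time(service_time, current_rate)
headroom = max_arrival_rate(service_time, limit) - current_rate
print(f"response now: {current_response * 1000:.1f} ms, "
      f"headroom: {headroom:.0f} more I/Os per second")
```

The point of the model is that response time climbs sharply as utilization approaches 100 percent, so a tool can raise an alert while there is still headroom rather than after the threshold has been crossed.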
VMware's vCenter Operations: VMware released its own advanced analytics engine, vCenter Operations, in March. It aggregates storage, network, CPU and memory performance data from the vSphere hypervisor and presents it in one view as an overall system health metric.
“We view this as a new approach to both infrastructure and operations management,” said Rob Smoot, VMware's director of product marketing management. “It helps you get an aggregate view of the health, performance and capacity of the environment. Increasingly, we think the three disciplines of performance, capacity management and configuration management need to come together.”
vCenter Operations uses sophisticated algorithms to determine when the aggregate systems as a whole are behaving abnormally. Then you can drill down into the specifics to see which system -- storage, networking or the physical host -- is causing the bottleneck. Smoot said the goal is to aggregate alerting systems into three cores: workload, capacity and system health.
“What operations teams deal with today are alerts for specific silos and the underlying aspects of the infrastructure,” Smoot said. “So they get these alert storms where there's just tons of noise in the environment. What vCenter Operations does is cut through that noise and alert you when there's a building performance problem that spans all of those individual metrics.”
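The general idea of replacing per-silo alerts with one aggregate signal can be sketched in a few lines. VMware has not published its algorithms in this article, so the following is purely an illustrative assumption: each metric is scored by how far it sits from its own recent baseline (a z-score), the scores are averaged into one health figure, and the worst offender is reported as the place to drill down. The metric names, sample values and threshold are all hypothetical.

```python
from statistics import mean, pstdev


def zscore(history, current):
    """How many standard deviations the current value is from its baseline."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0                   # flat baseline: treat as no deviation
    return (current - mean(history)) / sigma


def health_alert(baselines, current, threshold=2.0):
    """Combine per-metric deviations into a single alert decision.

    Returns (alert, combined_score, worst_metric).
    """
    scores = {
        name: abs(zscore(baselines[name], current[name]))
        for name in baselines
    }
    combined = mean(list(scores.values()))
    worst = max(scores, key=scores.get)
    return combined > threshold, combined, worst


# Hypothetical baselines for three silos and one abnormal sample.
baselines = {
    "storage_latency_ms": [5, 6, 5, 6],
    "cpu_pct": [40, 42, 38, 40],
    "net_mbps": [100, 110, 90, 100],
}
sample = {"storage_latency_ms": 9, "cpu_pct": 41, "net_mbps": 105}

alert, score, worst = health_alert(baselines, sample)
print(alert, round(score, 2), worst)
```

A single combined score produces one alert instead of three, and the worst-scoring metric points the administrator at the silo to investigate first.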
Virtual Instruments' VirtualWisdom: Virtual Instruments' VirtualWisdom SAN optimization and troubleshooting software focuses specifically on storage performance monitoring within virtual environments. As the successor to NetWisdom, VirtualWisdom provides monitoring, optimization and troubleshooting for heterogeneous Fibre Channel (FC) storage networks.
Skip Bacon, chief technology officer at Virtual Instruments, said that without a detailed understanding of what’s happening under the hood, server virtualization can go bad quickly. “The good news and bad news about server virtualization is that you can spin up new VMs very, very quickly, you can move VMs around very, very quickly, and in some cases with vMotion automatically,” Bacon said. “The bad news is that if you don't have a firm grasp of the underlying storage picture from both a capacity and performance perspective, then all that [dynamic computing] does is get you to the scene of the accident that much more quickly.”
The VirtualWisdom platform has several components to deal with the problem. The VirtualWisdom Server runs on a Windows server platform. The ProbeVM software acquires performance metrics from physical servers and VMs. The ProbeV software gets data from the SAN FC switch fabric, while ProbeFCX sends the base server SCSI device transactions and link metrics. The VirtualWisdom alarms are policy based and can trigger email notification and SNMP traps, as well as execute scripts and vMotion transfers.
This was first published in April 2011