When Vancouver Coastal Health (VCH) ran out of power and space in its data center last year, the healthcare services organization began virtualizing its servers to free up resources. That plan worked on the server side, but contention and performance bottlenecks soon sprang up on the healthcare group's HP EVA disk arrays.
With 60% of its server environment virtualized, VCH had too many hosts hitting the same storage array LUNs at once, creating "hot spots" of contention and leading to application problems because of latency. Ben Haley, regional supervisor for storage and backup, said that VCH's Windows clusters failed over twice because the latency caused the operating system to time out waiting for data.
So VCH went looking for a performance monitoring tool late last year, but found only one -- Akorri Inc.'s BalancePoint -- that had visibility into VMware, host applications, operating systems and servers. (Haley said he didn't investigate Symantec's CommandCentral tool. Tek-Tools Inc. has also since expanded its reporting tools to include VMware.) A consultant working with VCH recommended Akorri's performance monitoring software.
"The best thing about [BalancePoint] is the topology view, which shows us the path from the VMware host to the LUNs," Haley said. Before VCH rolled out BalancePoint, performance and contention issues on the disk arrays occurred about once a week, but Haley said, "Now it's more like once a month."
Deploying the performance monitoring software led VCH's IT staff to discover other existing problems. "Within the first two hours of deployment, we found two Exchange servers that had logs and mailbox data allocated to the same disk groups," Haley said.
BalancePoint's reports have also cut down on the work required to prepare the statistics VCH provides to management each quarter. "We used to have to go through each server ourselves and figure out the metrics," Haley said. "It's probably saved us 20 hours every quarter preparing reports."
Performance issues still pop up at times because of a lack of communication between the application and infrastructure teams, such as the time that a database administrator changed a hot backup window to coincide with a period of heavy traffic on the network. VCH stores 190 TB on five EVA arrays, with an average utilization rate around 70% per array. That makes the arrays more susceptible to performance issues because there's less room for error.
"Contention happens with physical servers, too," Haley said, adding, "I'm not a big fan of how VMware lays out its volumes." Along with the performance problems on the HP arrays, VMFS also makes it tricky to do snapshot backups on VMware volumes, he said.
Haley said he also wishes for better integration between HP and VMware, and "better performance monitoring, for sure." HP's EVA-perf utility isn't integrated with BalancePoint, though Haley said he's been pushing for it.
While Akorri offers him a good "server's eye view" of the network, integration with HP's storage monitoring tools would give him more detailed information while helping him organize that information better. "EVA-perf produces tons of data that can be hard to parse out," he said. "Akorri could help with that, and integrating that tool means we could get better insight into our disk arrays."