
SAN performance best practices: More ways to avoid bottlenecks

There are several ways you can fine-tune and improve your SAN performance. Follow our five tips to avoid data storage bottlenecks and improve SAN efficiency.

What you'll learn in this tip: There are several ways you can fine-tune and improve your storage-area network (SAN). This tip covers topics such as using ISLs and understanding HBA queue depth to help you avoid storage bottlenecks.

In this tip, we take a closer look at how SAN performance and SAN efficiency improve with transparency, testing and a better understanding of the impact your data storage has on the rest of your system. Check out our earlier tip on how to improve your storage networks to find out how storage performance issues are often linked to storage networks built on outdated information or not tested regularly.

Tip 1. Understand how you're using ISLs

Inter-switch links (ISLs) are critical areas for tuning and, as a SAN grows, they become increasingly important to performance. Fine-tuning an ISL is an art, and vendors often have conflicting rules of thumb for switch fan-in configurations and the number of hops between switches. In reality, the latency of a switch-to-switch connection is dramatically lower than that of a mechanical hard drive, often negligible by comparison. However, in high fan-in situations or where there are many hops (servers crossing multiple switches to access data), ISLs play an important role in performance.

The top concern is to ensure that ISLs are configured at the correct bandwidth between the switches, which seems to be a surprisingly common mistake. Beyond that, it's important to measure the traffic flow between hosts and switches, and the ISL traffic between the switches themselves. Switch reporting tools will provide much of this information, but a visual tool that measures switch intercommunication may be preferable.

Based on these traffic measurements, you can decide to rebalance traffic flow by changing which primary switch a server connects to, which involves physical rewiring and potential server downtime. Another option is to add ISLs, which increases bandwidth but consumes ports and, to some extent, adds to the complexity of the storage architecture.
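The measurement step above can be sketched as a small script. This is a hypothetical example, not a vendor tool: the link speeds, throughput samples and the 70% threshold are assumptions, and real figures would come from your switch reporting interface.

```python
# Hypothetical sketch: flag oversubscribed inter-switch links (ISLs) from
# throughput samples. The 70% utilization threshold is an assumption, not
# vendor guidance; tune it for your environment.

def isl_utilization(samples_mbps, link_speed_gbps):
    """Return peak utilization (0.0-1.0) of an ISL from MB/s samples."""
    link_mbps = link_speed_gbps * 1000 / 8  # Gb/s -> MB/s (approximate)
    return max(samples_mbps) / link_mbps

def flag_hot_isls(isls, threshold=0.70):
    """isls: {name: (samples_mbps, link_speed_gbps)}; return overloaded links."""
    return [name for name, (samples, speed) in isls.items()
            if isl_utilization(samples, speed) > threshold]

isls = {
    "sw1-sw2": ([410, 520, 760], 8),  # 8 Gbps FC ISL, MB/s samples
    "sw2-sw3": ([120, 90, 150], 8),
}
print(flag_hot_isls(isls))  # ['sw1-sw2'] -- peaks at 760/1000 = 76%
```

Links that come back flagged are the candidates for rebalancing servers onto a different primary switch or for adding a parallel ISL.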

Tip 2. Use NPIV for virtual machines

Server virtualization has changed just about everything about configuring SANs, and one of the biggest challenges is identifying which virtual machines demand the most from the infrastructure. Before server virtualization, a single server ran a single application and communicated with the SAN through a single host bus adapter (HBA); now a virtual host may have many servers communicating with the storage infrastructure through the same HBA. It's critical to identify the virtual machines that need storage I/O performance the most so they can be balanced across hosts instead of consuming all the resources of a single host. N_Port ID Virtualization (NPIV) is a feature supported by some HBAs that lets you assign each virtual machine its own virtual World Wide Name (WWN) that stays associated with it, even through virtual machine migrations from host to host. With NPIV, you can use your switches' statistics to identify the most active virtual machines from the storage point of view and allocate them appropriately across the hosts in the environment.
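Because NPIV gives each VM its own virtual WWN, per-port switch statistics can be rolled up per VM. Here is a minimal sketch of that aggregation; the record format and the WWN-to-VM mapping are assumptions, since the real data comes from your switch's reporting interface.

```python
# Hypothetical sketch: aggregate switch throughput statistics per VM.
# With NPIV, each VM logs in with its own virtual WWN, so per-WWN switch
# counters can be attributed to individual virtual machines.
from collections import defaultdict

def top_vms_by_io(records, n=3):
    """records: (virtual_wwn, vm_name, mbps) tuples; return the n busiest VMs."""
    totals = defaultdict(float)
    for wwn, vm, mbps in records:
        totals[vm] += mbps
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

records = [  # assumed sample data from a switch report
    ("21:00:00:1b:32:aa:00:01", "sql-vm", 310.0),
    ("21:00:00:1b:32:aa:00:02", "web-vm", 45.0),
    ("21:00:00:1b:32:aa:00:01", "sql-vm", 280.0),
    ("21:00:00:1b:32:aa:00:03", "backup-vm", 120.0),
]
print(top_vms_by_io(records, n=2))  # [('sql-vm', 590.0), ('backup-vm', 120.0)]
```

The VMs at the top of this list are the ones worth spreading across different hosts (and therefore different physical HBAs).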

Tip 3. Know thy HBA queue depth

HBA queue depth is the maximum number of pending storage I/O requests the adapter will issue to the data storage infrastructure. When installing an HBA, most storage administrators simply use the card's default settings, but the default queue depth is typically too high. This can cause storage ports to become congested, leading to application performance issues. If queue depth is set too low, the ports and the SAN infrastructure itself aren't used efficiently. When a storage system isn't loaded with enough pending I/Os, it doesn't get the opportunity to use its cache; if essentially everything expires out of cache before it can be accessed, the majority of accesses will then come from disk. Most HBAs set the default queue depth between 32 and 256, but the optimal range is actually closer to 2 to 8. Most initiators can report on the number of pending requests in their queues at any given time, which allows you to strike a balance between too much and not enough queue depth.
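On Linux, the per-LUN queue depth the HBA driver is using is exposed under sysfs, so you can audit it without vendor tools. The sketch below is an assumption-laden example: the sysfs path follows the standard Linux SCSI layout but may differ by driver, and the 2-to-8 range it checks against is the article's suggestion, not a universal rule.

```python
# Hypothetical Linux sketch: audit SCSI queue depths against the article's
# suggested 2-8 range. Paths assume the standard Linux SCSI sysfs layout.
import glob
import os

def classify_depth(depth, low=2, high=8):
    """Classify a queue depth against an assumed acceptable range."""
    if depth < low:
        return "too low"
    if depth > high:
        return "too high"
    return "ok"

def report_queue_depths(pattern="/sys/bus/scsi/devices/*/queue_depth"):
    """Yield (device, queue_depth) for each SCSI device exposing the setting."""
    for path in glob.glob(pattern):
        device = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            yield device, int(f.read().strip())

for device, depth in report_queue_depths():
    verdict = classify_depth(depth)
    if verdict != "ok":
        print(f"{device}: queue_depth={depth} ({verdict})")
```

Before lowering a live setting, check your array vendor's recommendation; the right value depends on how many initiators share each storage port.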

Tip 4. Multipath verification

Multipath verification involves ensuring that I/O traffic is actually distributed across redundant paths. In many environments, our experts said, multipathing isn't working at all or the load isn't balanced across the available paths. For example, if one path is carrying 80% of its capacity and the other path only 3%, availability suffers if an HBA or its connection fails, and application performance can suffer as well. The goal should be to ensure that traffic is balanced fairly evenly across all available HBA ports and ISLs.

You can use switch reports for multipath verification. To do this, run a report listing port WWN, port name and MBps, sorted by port name and filtered to attached devices of type "server." This is a quick way to identify which links have balanced multipaths, which ones are currently acting as active/passive and which ones don't have an active redundant HBA.
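The classification described above can be sketched as a simple check on the per-path throughput figures from such a report. The 4:1 imbalance ratio here is an assumed threshold, not a standard; pick one appropriate to your multipathing policy.

```python
# Hypothetical sketch: classify a server's redundant paths from per-port
# MB/s figures taken from a switch report. A zero-throughput path suggests
# active/passive operation or a failed HBA/connection.

def path_status(port_mbps, max_ratio=4.0):
    """port_mbps: list of MB/s per redundant path for one server."""
    if any(mbps == 0 for mbps in port_mbps):
        return "path inactive"  # active/passive, or a dead HBA/link
    if max(port_mbps) / min(port_mbps) > max_ratio:
        return "unbalanced"
    return "balanced"

print(path_status([400.0, 380.0]))  # balanced
print(path_status([800.0, 30.0]))   # unbalanced (the 80%-vs-3% case above)
print(path_status([500.0, 0.0]))    # path inactive
```

Servers that come back "unbalanced" or "path inactive" are the ones to investigate at the multipathing driver before assuming the SAN itself is at fault.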

Tip 5. Improve replication and backup performance

While some environments have critical concerns over the performance of a database application, almost all of them need to decrease the time it takes to perform backups or replication. Both processes are challenged by rapidly growing data sets that must be replicated across relatively narrow bandwidth connections within ever-shrinking backup windows. They're also the processes most likely to put a continuous load across multiple segments of the SAN infrastructure, and the backup server is the most likely candidate to receive data that has to hop across switches or zones to reach it.

All of the above tips apply doubly to backup performance. Also consider adding extra HBAs to the backup server and routing their ports to specific switches within the environment to minimize ISL traffic.

This article originally appeared in Storage magazine.
