This article can also be found in the Premium Editorial Download "Storage magazine: A look inside Hitachi's TagmaStor high-end arrays."
SANscreen pinpoints configuration problems: SANscreen shows where proposed changes will violate operational policies and possibly cause a disruption in storage area network service. In this example, the policy on the authorized access path between Host A and Storage 2 has been violated, perhaps due to a failed HBA or Switch 2 reboot.
Data sources are the processes that send information to the SANscreen server. They're associated with acquisition units and tied to individual SAN devices. To add a data source, you must have the IP address, and the administrative user name and password of the related SAN device or management station.
With a single storage group managing an entire SAN, providing SANscreen with SAN device user names and passwords wouldn't be considered a security threat if the management network is hardened against outside access. However, in larger organizations, turf wars may erupt because business unit administrators may not want to share passwords with a separate enterprise storage group.
A better solution might be to provide the business unit administrators with an interface to change user names and passwords for the devices they manage. The enterprise storage group would still be able to get a telnet or Web session with the SAN device, but only through SANscreen where accounting takes place. Outside of SANscreen, only business unit administrators with the correct passwords would be able to gain direct access, thus enhancing security.
SANscreen violations result when a policy is broken on an authorized access path. Violations signal that an unauthorized change has occurred or indicate that an impending change will violate a policy. An authorization wizard is used to associate policies with access paths; policies provide the basis by which SANscreen determines whether a path is healthy or not (shown as green or red on the topology display).
During the authorization process, a previously defined path is chosen from a list and assigned a policy. A policy can specify, for example, a redundancy level for the path, the minimum number of host ports with access to a particular LUN and the maximum number of switch hops between end points.
After a path has been authorized, SANscreen will apply events received from data sources to the access path's policy to determine if a violation has occurred.
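To make the policy check concrete, here is a minimal Python sketch of how an access path's observed state might be compared against its policy. It is not SANscreen's implementation or API; the class names, field names and messages are all hypothetical, and only the three policy settings (redundancy, host ports, switch hops) come from the description above.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical model: names and fields are illustrative, not SANscreen's.
@dataclass(frozen=True)
class PathPolicy:
    min_redundancy: int    # independent routes required between the end points
    min_host_ports: int    # host ports that must have access to the LUN
    max_switch_hops: int   # longest allowed route, counted in switch hops

@dataclass(frozen=True)
class PathState:
    redundancy: int
    host_ports: int
    switch_hops: int

def find_violations(policy: PathPolicy, state: PathState) -> list[str]:
    """Return one message per policy setting the observed state breaks."""
    violations = []
    if state.redundancy < policy.min_redundancy:
        violations.append("redundancy below policy minimum")
    if state.host_ports < policy.min_host_ports:
        violations.append("too few host ports with LUN access")
    if state.switch_hops > policy.max_switch_hops:
        violations.append("too many switch hops between end points")
    return violations

policy = PathPolicy(min_redundancy=2, min_host_ports=2, max_switch_hops=3)
healthy = PathState(redundancy=2, host_ports=2, switch_hops=2)
degraded = PathState(redundancy=1, host_ports=1, switch_hops=2)  # e.g. after an HBA failure

print(find_violations(policy, healthy))   # empty list: path would show green
print(find_violations(policy, degraded))  # two messages: path would show red
```

In this sketch an empty list corresponds to a green path and any non-empty list to a red one.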
Violations can include unauthorized or missing paths, or a reduction in redundancy due to a failed inter-switch link (ISL) or HBA. Clicking on the violations icon launches a screen with a list of violations in the main table; when a specific violation is selected, the access path with the violation is highlighted on the topology pane (see "SANscreen pinpoints configuration problems"). Selecting the device involved in the violation displays the details pane that shows the port connectivity of the device and a list of recent changes for the device.
Violations occur for two reasons. An unexpected change event in the SAN will trigger a violation. A violation will also occur if there was a planned change in the SAN, but the authorized path list wasn't updated. Unexpected changes are easy to track back to their root cause. The second case is more of an "oops, I forgot" thing and is remedied by updating the authorized path list.
A historical list of changes can be viewed by clicking the changes icon in the shortcut pane. The time-stamped list shows all the changes on a managed data source since a given time. During our evaluation, zoning, LUN masking and device addition and removal changes were all tested successfully. Administrators can also simulate rolling a SAN configuration back to specific changes and points in time to see exactly what change broke a specific policy or caused an error.
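The rollback idea can be illustrated by replaying a time-stamped change list against a starting configuration and stopping at the first change that breaks a policy. The following Python sketch assumes a deliberately simple model in which a path's state is the set of its active routes; the function name, the tuple layout and the route names are hypothetical and do not reflect how SANscreen stores changes.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

# Hypothetical change record: (ISO timestamp, "add" or "remove", route name)
Change = Tuple[str, str, str]

def first_breaking_change(
    initial_routes: Iterable[str],
    changes: Iterable[Change],
    is_compliant: Callable[[Set[str]], bool],
) -> Optional[Change]:
    """Replay changes in time order; return the first one that breaks the policy."""
    routes = set(initial_routes)
    for change in sorted(changes):  # ISO timestamps sort chronologically
        _, action, route = change
        if action == "add":
            routes.add(route)
        elif action == "remove":
            routes.discard(route)
        else:
            raise ValueError(f"unknown action: {action!r}")
        if not is_compliant(routes):
            return change
    return None  # no change in the list broke the policy

history = [
    ("2004-09-01T10:00", "add", "hba2-sw1"),
    ("2004-09-01T11:00", "remove", "hba1-sw2"),
    ("2004-09-01T12:00", "remove", "hba2-sw1"),
]
culprit = first_breaking_change(
    ["hba0-sw1", "hba1-sw2"],
    history,
    is_compliant=lambda routes: len(routes) >= 2,  # assumed policy: two routes minimum
)
print(culprit)
```

Here the first two changes leave at least two routes in place, so the sketch reports the 12:00 removal as the change that broke the assumed redundancy policy.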
The vulnerabilities screen displays data sources that are at risk of being in violation if included in an authorized path. Some examples of vulnerability types are blocked hosts, blocked ports, high LUN utilization per storage device and unconnected zone ports.
Although the list of vulnerability types is fairly comprehensive, I'd like to see SANscreen predict hardware vulnerabilities as well. For example, the link error status block is part of the Fibre Channel Physical (FC-PH) standard. It maintains counters such as "loss of signal" and "bad CRC." When these counters increment quickly, it usually means that a gigabit interface converter or other physical component between ports is about to fail. These counters offer insight into the physical health of a SAN, so it seems natural to include them in SANscreen's analysis.
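As a rough illustration of the kind of check I have in mind, the Python sketch below compares two samples of a port's error counters and flags any counter that is climbing faster than a threshold. The counter names, the sampling interval and the threshold of 10 increments per minute are assumptions chosen for the example, not values from the standard or from SANscreen, and the sketch ignores counter wraparound and resets.

```python
from typing import Dict

def fast_rising_counters(
    earlier: Dict[str, int],
    later: Dict[str, int],
    interval_s: float,
    max_per_minute: float = 10.0,  # assumed threshold, tune per environment
) -> Dict[str, float]:
    """Return counters whose increase per minute exceeds the threshold."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    flagged = {}
    for name, old_value in earlier.items():
        rate = (later[name] - old_value) * 60.0 / interval_s
        if rate > max_per_minute:
            flagged[name] = rate
    return flagged

# Two hypothetical samples of one port's counters, taken five minutes apart.
sample_t0 = {"loss_of_signal": 3, "invalid_crc": 120}
sample_t1 = {"loss_of_signal": 4, "invalid_crc": 720}

suspect = fast_rising_counters(sample_t0, sample_t1, interval_s=300)
print(suspect)  # {'invalid_crc': 120.0}
```

A single loss-of-signal event in five minutes stays under the threshold, while 600 new CRC errors in the same window works out to 120 per minute and is flagged.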
SAN changes are planned, prevalidated and stored in the tasks screen using a MySQL database. You can create tasks with unique reference numbers and associate them with actions necessary to complete the task. You can perform a prevalidation of a change by logically stepping through action keywords (create, add, connect, authorize) to flag potential errors.
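Prevalidation of this kind can be pictured as a dry run: each action is applied to a simulated copy of the configuration, and any step that cannot succeed is reported before anything touches the real SAN. The Python sketch below uses the four action keywords named above, but the rules attached to them (a zone must exist before a port is added, a path must be connected before it is authorized) and every name in the code are hypothetical, not SANscreen's task format.

```python
from typing import List, Sequence, Tuple

def prevalidate(actions: Sequence[Tuple[str, ...]]) -> List[str]:
    """Step through planned actions on a simulated state and collect errors."""
    zones = {}         # zone name -> set of member ports
    connected = set()  # (host, storage) pairs with a planned connection
    errors = []
    for step, (keyword, *args) in enumerate(actions, start=1):
        if keyword == "create":
            (zone,) = args
            if zone in zones:
                errors.append(f"step {step}: zone {zone!r} already exists")
            else:
                zones[zone] = set()
        elif keyword == "add":
            zone, port = args
            if zone not in zones:
                errors.append(f"step {step}: zone {zone!r} does not exist")
            else:
                zones[zone].add(port)
        elif keyword == "connect":
            host, storage = args
            connected.add((host, storage))
        elif keyword == "authorize":
            host, storage = args
            if (host, storage) not in connected:
                errors.append(f"step {step}: no connection from {host} to {storage}")
        else:
            errors.append(f"step {step}: unknown keyword {keyword!r}")
    return errors

plan = [
    ("create", "zone_a"),
    ("add", "zone_a", "hostA_hba0"),
    ("add", "zone_b", "storage2_p1"),      # zone_b was never created
    ("authorize", "Host A", "Storage 2"),  # no connect step precedes this
]
for problem in prevalidate(plan):
    print(problem)
```

Running the sketch on this plan reports steps 3 and 4, which is the sort of feedback that lets an administrator fix the task before the change window opens.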
SANscreen has two pricing models: perpetual and subscription-based. The perpetual license starts at $125 per port, and the subscription model is priced at $64 a year per port, with volume discounts available.
Considering the ROI related to maintaining SAN stability during the change process, SANscreen's cost is nominal when compared to the cost of application downtime and the man hours required for manual changes.
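For a quick comparison of the two models, the following Python sketch works out the break-even point from the starting list prices quoted above. It ignores volume discounts and any maintenance or support fees, which aren't itemized here, and the 500-port, three-year scenario is an arbitrary example.

```python
from typing import Tuple

PERPETUAL_PER_PORT = 125.0          # one-time starting list price, USD
SUBSCRIPTION_PER_PORT_YEAR = 64.0   # list price per port per year, USD

def license_costs(ports: int, years: float) -> Tuple[float, float]:
    """Return (perpetual_total, subscription_total) at list price."""
    perpetual = ports * PERPETUAL_PER_PORT
    subscription = ports * SUBSCRIPTION_PER_PORT_YEAR * years
    return perpetual, subscription

# Years after which the subscription has cost as much as the perpetual license.
break_even_years = PERPETUAL_PER_PORT / SUBSCRIPTION_PER_PORT_YEAR

print(break_even_years)        # 1.953125
print(license_costs(500, 3))   # (62500.0, 96000.0)
```

At these list prices the subscription overtakes the perpetual license just before the two-year mark, so the perpetual model looks cheaper for any deployment expected to run longer than that.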
SANscreen 2.5.2 is must-have software for large, complex SANs that undergo frequent configuration changes. SANscreen steers change management away from Murphy's law and puts the SAN administrator back in control.
This was first published in September 2004