As I sit here recovering from a nasty case of the flu and subsequent sinus infection, I have had time to reflect on the health of the many storage networks I hear about. Most of the calls and e-mails I get are about urgent user issues. Over the course of the year, storage network admins are inundated with urgent problems. The "to do" list never seems to get shorter, as one very harried storage admin lamented.
Many of the urgent problems that crop up are the direct result of not taking care of the merely "important" issues. For example, when that Fibre Channel (FC) HBA failed, the application server switched over to the failover HBA and suddenly couldn't see its storage. The phone started ringing off the hook with agitated users and a very upset application admin. After numerous painstaking hours of troubleshooting, the storage network admin figured out that the physical path of the backup HBA and the hard zone did not match up. Could this crisis have been avoided?
Yes, it could have, if the storage network admin had used automated or even manual network verification tools. Network verification is an important task that takes a back seat to the urgent ones until a problem occurs. Granted, there is still only one SAN verification tool on the market today (Onaro's SANscreen) and it is a bit expensive. But how expensive is 10 hours of application downtime?
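Even without a commercial tool, the kind of mismatch described above can be caught with a simple scripted check. What follows is only a minimal sketch, not how SANscreen or any switch vendor's software works. It assumes you can export three things by hand or from your own inventory: which switch port each HBA is cabled to, which switch ports belong to each hard zone, and which storage ports each HBA is supposed to reach. All of the names, port labels and data structures here are hypothetical.

```python
def find_zoning_mismatches(cabling, zones, required_targets):
    """Report HBAs whose cabled switch port shares no zone with a required storage port.

    cabling:          hypothetical map of HBA name -> switch port it is plugged into
    zones:            hypothetical map of zone name -> set of member switch ports
    required_targets: hypothetical map of HBA name -> set of storage switch ports it must reach
    Returns a list of (hba, switch_port, sorted list of unreachable storage ports).
    """
    problems = []
    for hba in sorted(cabling):
        port = cabling[hba]
        # Collect every port that shares at least one zone with this HBA's port.
        reachable = set()
        for members in zones.values():
            if port in members:
                reachable |= set(members)
        missing = set(required_targets.get(hba, ())) - reachable
        if missing:
            problems.append((hba, port, sorted(missing)))
    return problems


if __name__ == "__main__":
    # Illustrative data only: the failover zone lists port sw2:5,
    # but the failover HBA is actually cabled to sw2:7.
    cabling = {"app01-hba0": "sw1:4", "app01-hba1": "sw2:7"}
    zones = {
        "app01_primary": {"sw1:4", "sw1:12"},
        "app01_failover": {"sw2:5", "sw2:12"},
    }
    required_targets = {"app01-hba0": {"sw1:12"}, "app01-hba1": {"sw2:12"}}

    for hba, port, missing in find_zoning_mismatches(cabling, zones, required_targets):
        print(f"{hba} on {port} cannot reach {', '.join(missing)}")
```

Run after every cabling or zoning change, a check like this would flag the failover HBA before it is ever needed. It only compares records, so it is only as good as the cabling and zone data fed into it.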
That's just one example. When was the last time you really attempted to recover from your backup? How about a partial recovery? Do you know it will work? Or do you hope it will work? There really are backup and CDP software products that simplify the testing issue and guarantee recovery.
How about providing requested information to the corporate attorneys in the case of legal discovery from a lawsuit? Can you do it? Can you do it in a timely, efficient manner? If you can't, how much will it cost the organization? Again, there is effective software that simplifies this process.
And security is always a hot topic of discussion. Yet, most of the implementations seem to occur after there has been a breach. This is another example of the important not getting done, causing an urgent issue later.
And of course there is always storage resource management (SRM). This is the classic important problem that rarely gets implemented, leading to urgent provisioning issues.
So, what can a storage network admin do to lighten their urgent load? Focus on the important as well as the urgent. Set aside and schedule some time every day for important tasks before they become urgent. Here are some best practices that can make a difference in the positive health of the storage network:
- Verify that the SAN implementation matches the organization's policies and works the way you think it does. And be sure to verify it every time a change is made to the SAN.
- Test backups at least once a quarter to make sure that you can really recover from them when you need to.
- Change the default password on your SAN switches/directors.
- Make sure the SAN switch Ethernet management port is on a completely isolated LAN segment.
- Make sure your backups are encrypted in-flight and at rest.
- Implement effective SRM that allows you to plan your storage instead of having the urgent storage requirements manage you.
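Part of the quarterly backup test in the list above can be scripted. The sketch below is one possible approach, not a feature of any particular backup product: it assumes you have restored a sample directory to a scratch location using your own backup software, and that the original files have not changed since the backup was taken. It then compares the two trees by SHA-256 checksum. The function names and paths are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_restore(source_dir, restored_dir):
    """Compare a restored tree against its source.

    Returns (missing, corrupted): relative paths absent from the restore,
    and relative paths whose contents differ from the source.
    Assumes the source is unchanged since the backup was taken.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    missing, corrupted = [], []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            missing.append(str(rel))
        elif file_digest(src) != file_digest(restored):
            corrupted.append(str(rel))
    return missing, corrupted


if __name__ == "__main__":
    # Hypothetical paths: replace with a real sample directory and its restored copy.
    missing, corrupted = compare_restore("/data/sample", "/scratch/restore_test/sample")
    print("missing:", missing)
    print("corrupted:", corrupted)
```

An empty result for both lists is evidence, not proof, that the backup is recoverable. A full test still means restoring an application and confirming it actually runs.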
Spend more time on the "important" aspects of your networked storage and there will be a lot less of the "urgent" bushwhacking your day. Now, I just have to remember to get a flu shot this year.
About the author: Marc Staimer is president and CDS of Dragon Slayer Consulting in Beaverton, Oregon. He is widely known as one of the leading storage market analysts in the network storage and storage management industries. His practice of six-plus years provides consulting to the end-user and vendor communities.
This was first published in February 2006