Our countdown, brought to you by SearchStorage.com high-availability expert Evan Marcus, includes some common-sense tips for the everyday storage admin to follow.
#8: Examine system history
1. Which component has caused the most downtime historically?
- Keep availability statistics
- Publicize them when they are good
2. Examine root causes of these failures
- May be a non-system cause, such as poor source code control practices
- May be environmental, such as inadequate power conditioning
3. Defend against the biggest causes first
- Worry about lesser problems later
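
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the outage records, component names, root-cause labels and the helper names `downtime_by` and `availability` are all hypothetical, and a real site would pull this data from its own incident log.

```python
from collections import defaultdict

# Hypothetical outage log: (component, root cause, minutes of downtime).
# These records are invented for illustration only.
OUTAGES = [
    ("disk array", "hardware failure", 240),
    ("application", "bad release (source control)", 90),
    ("power", "unconditioned power", 180),
    ("application", "bad release (source control)", 160),
    ("network", "switch misconfiguration", 45),
]


def downtime_by(outages, key_index):
    """Total downtime grouped by one field, largest total first.

    key_index 0 groups by component, 1 groups by root cause.
    """
    totals = defaultdict(int)
    for record in outages:
        totals[record[key_index]] += record[2]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def availability(total_minutes, downtime_minutes):
    """Availability as a percentage of the measurement period."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


# Step 1: which component has caused the most downtime?
by_component = downtime_by(OUTAGES, 0)

# Step 2: which root causes are behind that downtime?
by_cause = downtime_by(OUTAGES, 1)

# Step 3: the top entry is the one to defend against first.
worst_component, worst_minutes = by_component[0]

if __name__ == "__main__":
    for component, minutes in by_component:
        print(f"{component}: {minutes} minutes down")
    for cause, minutes in by_cause:
        print(f"{cause}: {minutes} minutes down")
    print(f"Defend first against: {worst_component}")
```

Sorting by total downtime puts the worst offender at the top of each list, which is the ordering the tip asks for: handle the first entry, then work down to the lesser problems. The `availability` helper turns the same totals into the kind of percentage figure that can be publicized when it is good.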
For more, see Evan's book, Blueprints for High Availability: Designing Resilient Distributed Systems.
This material is copyright 1997-2002 by Evan Marcus and Hal L. Stern. It may not be used in whole or part for commercial purposes without the express permission of both authors.
This was first published in December 2002