
How to replace a failed drive in an array

Rick Cook explains how to avoid replacing the wrong drive in an array.


Replacing the wrong drive in an array is one of those Homer Simpson moves that happen all too frequently. It not only makes you feel foolish but, unless you're using a dual-parity scheme like RAID 6, pulling a healthy drive from an array that has already lost one is likely to take down or corrupt the entire array.

When you've got 60 drives or more crammed into one drawer, you need to make sure you identify the right one. Haste is your biggest enemy here. The rule is to positively, absolutely, identify which drive has failed before you pull it. Then double-check your work.

When installing drives in an array, make a note of the serial number of the drive assigned to each OS identifier, and then keep that information up to date as you replace drives.
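As a rough illustration of that record-keeping, here is a minimal Python sketch. The function names, the OS identifiers (`sda`, `sdb`), the slot labels and the serial numbers are all made up for the example; how you actually read a serial number off a drive depends on your controller and tools.

```python
import json

def record_drive(inventory, os_id, serial, slot):
    """Store (or overwrite) the entry for one OS identifier."""
    inventory[os_id] = {"serial": serial, "slot": slot}
    return inventory

def replace_drive(inventory, os_id, new_serial):
    """Update the serial after a swap, keeping the old one for the record."""
    entry = inventory[os_id]
    entry.setdefault("previous_serials", []).append(entry["serial"])
    entry["serial"] = new_serial
    return inventory

# Hypothetical identifiers and serial numbers, for illustration only.
inventory = {}
record_drive(inventory, "sda", "WX41A0001", "A1")
record_drive(inventory, "sdb", "WX41A0002", "A2")

# Later, the drive behind "sdb" fails and is swapped out.
replace_drive(inventory, "sdb", "ZK90B7731")

print(json.dumps(inventory, indent=2))
```

Keeping the previous serial numbers alongside the current one is a design choice, not a requirement; it simply makes it easier to tell later which physical drive a given record once referred to.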

If you can still boot up the system, you can reboot and check the controller's RAID BIOS. The BIOS will often identify the drives by serial number and should tell you which drive failed. Once you have the serial number of the bad drive, it's a simple matter to compare it against the serial numbers of all the drives in the array. Just remember that the original drives in an array typically have very similar serial numbers, so compare every character rather than just the first or last few. Again, make sure you've got the right one.
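The serial-number comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slot labels and serial numbers are invented, `locate_failed_drive` is a hypothetical helper, and the "look-alike" threshold of two differing characters is an arbitrary choice. The point is to insist on exactly one exact match and to flag the near-identical serials that invite a mix-up.

```python
def differing_positions(a, b):
    """Count character positions where two serials differ.

    Serials of different length are treated as not alike at all.
    """
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(1 for x, y in zip(a, b) if x != y)

def locate_failed_drive(failed_serial, slot_serials, near=2):
    """Return (slot, lookalikes) for an exact serial match.

    slot_serials maps a slot label to the serial read off that drive.
    lookalikes lists the other slots whose serial differs from the
    failed one by `near` characters or fewer.
    """
    matches = [slot for slot, s in slot_serials.items() if s == failed_serial]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, found {len(matches)}")
    lookalikes = sorted(
        slot
        for slot, s in slot_serials.items()
        if s != failed_serial and differing_positions(s, failed_serial) <= near
    )
    return matches[0], lookalikes

# Hypothetical slot labels and serial numbers, for illustration only.
slots = {
    "A1": "WX41A0001",
    "A2": "WX41A0002",
    "A3": "WX41A0008",
    "B1": "ZK90B7731",
}
slot, lookalikes = locate_failed_drive("WX41A0002", slots)
print(f"Failed drive is in slot {slot}; easy to confuse with: {lookalikes}")
```

With the sample data above, the failed drive resolves to slot A2, and slots A1 and A3 are flagged because their serials differ from it by a single character.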

A number of enclosure vendors put warning LEDs on their racks, especially in hot-swap systems. An orange or red light will go on to indicate which drive is bad. Many vendors' management software will identify the failed drive by its location in the tray or the rack, and some will even draw you a map. The most error-prone identification system involves identifying the drive in a matrix, for example A3, where A is a row and 3 is a column. A map is more helpful, provided you orient it correctly to the drive rack or tray. Again, check and double-check before touching the drives.
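To make the matrix-versus-map point concrete, here is a small Python sketch that turns a label such as A3 into a text map. It assumes rows are lettered from the top starting at A and columns are numbered from the left starting at 1; real enclosures may number from a different edge, which is exactly why the orientation has to be checked against the hardware. The function names are hypothetical.

```python
def parse_slot(label):
    """Split a label like 'A3' into zero-based (row, column)."""
    row_letter, column_number = label[0].upper(), int(label[1:])
    if not row_letter.isalpha() or column_number < 1:
        raise ValueError(f"unrecognized slot label: {label!r}")
    return ord(row_letter) - ord("A"), column_number - 1

def draw_map(rows, columns, failed_label):
    """Return a text map with 'X' at the failed slot and '.' elsewhere.

    Assumes row A is at the top and column 1 is at the left.
    """
    failed_row, failed_col = parse_slot(failed_label)
    if failed_row >= rows or failed_col >= columns:
        raise ValueError(f"{failed_label} is outside a {rows}x{columns} tray")
    lines = ["  " + " ".join(str(c + 1) for c in range(columns))]
    for r in range(rows):
        cells = (
            "X" if (r, c) == (failed_row, failed_col) else "."
            for c in range(columns)
        )
        lines.append(chr(ord("A") + r) + " " + " ".join(cells))
    return "\n".join(lines)

# A hypothetical 3-row, 4-column tray with the failed drive reported as A3.
print(draw_map(3, 4, "A3"))
```

For the 3-by-4 tray in the example, the X lands in the top row, third position from the left. The column header only lines up for trays with nine or fewer columns; a wider tray would need padded column labels.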

So how, you may ask, with all that help and all these warnings, can you possibly still make a mistake? Trust me, you can. (Don't ask me how I know, okay? Just trust me that it's possible.)

About the author: Rick Cook specializes in writing about issues related to storage and storage management.
