Faster disk drive rebuilds: Hot Spots


This article can also be found in the Premium Editorial Download "Storage magazine: Choosing the best disaster recovery planning tool."

Download it now to read this article plus other related content.

New data protection schemes

Storage vendors are finally beginning to understand that it's not about protecting disks but protecting information, and their data protection schemes are evolving to reflect this. There are some novel approaches in the market to solving the problems produced by large, slow drives. Some technologies reduce the overall number of rebuilds a system performs. Some have shifted to information-based protection schemes in which, rather than mirroring a disk, they mirror information (files, chunks or objects). Some even do a little of each. So how does this impact rebuild times? When you think in terms of rebuilding information rather than a single disk, you can put the power of the system architecture to work, leveraging the massive parallelism opportunity presented by multidisk architectures.

There are several technologies in the market today that reduce the overall number of drive failures, and thus the number of rebuilds required. In some instances, vendors take unresponsive drives offline to diagnose problems and return them to service if no trouble is found. This is a great approach, as it eliminates the need to perform a full rebuild. When the drive goes offline, the system journals all writes that would have gone to that drive while attempting to recover the drive. After a successful recovery, only the data in the journal is required to be rebuilt, not the entire disk.

Some vendors have a two-pronged approach

Requires Free Membership to View

that reduces the overall number of rebuilds required and speeds rebuild time leveraging grid storage architectures. One approach kicks in when a drive doesn't respond immediately to an access request. The system responds by doing a mini parity rebuild of the requested data and returning the rebuilt data while taking the non-responsive drive temporarily out of service. This drive then undergoes a brief diagnosis and is returned to service, thereby eliminating the need for a rebuild. Any data written while the drive is offline is written to other available space in the system.

This also speeds rebuilds by putting its grid architecture to work. Most grid-based architectures have capacity or storage nodes and separate processor nodes. Typically, all processor nodes can access all capacity nodes. When data is written, it's broken into a number of fragments. These fragments are then distributed across as many storage nodes as are in the system. Using a default of nine data fragments and three parity fragments (the exact number of parity fragments is user configurable), each of 12 storage nodes would get a fragment. If there are four storage nodes (the minimum configuration), each node gets three fragments. In the event of a drive failure, the data from that drive is rebuilt, just like in conventional hardware RAID. But unlike conventional RAID, data isn't rebuilt to a single drive; the data is redistributed across the storage nodes leveraging any available storage capacity. If an entire storage node fails, the data from those drives is rebuilt across the remaining storage nodes. We've seen this type of technology implemented for both parity-protected data and mirrored data. Thanks to protecting data rather than disk drives, as well as the power of a grid architecture, rebuilds happen in a fraction of the time it would take for a conventional drive rebuild. It's the information that's being rebuilt, not the exact drive layout.

This was first published in January 2009

There are Comments. Add yours.

TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: