
# How does RAID5/parity really work?

How does RAID5/parity really work? Does the parity block act as a temporary buffer? What exactly is going on here, XOR-wise? Also, could you point me to a reference?

RAID5 works by combining three or more physical disk drives into a single logical entity. Controllers typically allow anywhere from 3 to 32 drives in a set, although in practice most RAID arrays cap a set at about 15.

RAID5 uses "distributed parity." Parity information is generated for each write command to the logical disk, and instead of being kept on one dedicated drive, that parity data is "striped" across all the disks in the RAID5 "set."
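To picture what "distributed" means, here is a minimal Python sketch that prints which disk holds parity for each stripe. The rotation rule in `parity_disk` is an assumption chosen for illustration; real controllers use their own layouts, and the function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk index that holds parity for a stripe (illustrative rotation only)."""
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int) -> list[list[str]]:
    """Return one row per stripe, marking parity as 'P' and data as D0, D1, ..."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        row, d = [], 0
        for disk in range(num_disks):
            if disk == p:
                row.append("P")
            else:
                row.append(f"D{d}")
                d += 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for stripe, row in enumerate(layout(4, 4)):
        print(f"stripe {stripe}: " + "  ".join(row))
```

Because the parity position moves from stripe to stripe, no single drive has to absorb every parity update.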

Parity for a RAID5 set is generated, usually in hardware by the RAID controller, by a process known as "read, modify, write." On a RAID5 set, a write operation is not complete until the following steps have occurred:

1. Read old data from the disks
2. Read the old parity from the disks, then calculate the new parity: XOR the old data with the new data to get the difference, and XOR that difference into the old parity
3. Write the new data to disks
4. Write the new parity to disks

This means that every small write to the logical disk actually turns into four disk operations: two reads and two writes. This is known as the "RAID5 write penalty." RAID5 is a tradeoff between speed, reliability and disk space. RAID5 has the advantage of using less disk space for parity data than a RAID1 mirror set uses for its second copy, but there is a performance hit due to the overhead.
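The read, modify, write steps above can be sketched in Python as follows. This is a toy model, not controller firmware: each "disk" is just one block of bytes in a list, and `small_write` and `xor_blocks` are hypothetical helper names.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(disks: list[bytes], data_idx: int, parity_idx: int,
                new_data: bytes) -> int:
    """Update one data block and its parity; return the number of disk I/Os."""
    old_data = disks[data_idx]                   # 1. read old data
    old_parity = disks[parity_idx]               # 2. read old parity
    delta = xor_blocks(old_data, new_data)       #    bits that changed
    new_parity = xor_blocks(old_parity, delta)   #    fold the change into parity
    disks[data_idx] = new_data                   # 3. write new data
    disks[parity_idx] = new_parity               # 4. write new parity
    return 4                                     # two reads + two writes


if __name__ == "__main__":
    # Three data blocks plus parity (0x0f ^ 0xf0 ^ 0x55 == 0xaa).
    disks = [b"\x0f", b"\xf0", b"\x55", b"\xaa"]
    ios = small_write(disks, data_idx=1, parity_idx=3, new_data=b"\x3c")
    print(ios, disks[3].hex())
```

Note that the other data disks are never touched: the old data and old parity carry enough information to compute the new parity.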

The actual mathematical calculation of the parity data is done as follows:

The parity is generated by grouping together the corresponding bits (0s and 1s) from each data block in a stripe and XORing them. XOR yields 1 when the group contains an odd number of 1s and 0 otherwise, so the parity bit is in effect an extra bit added to each group of data bits so that the group as a whole always has an even number of 1s. When a disk fails, each of its missing bits is whatever value restores that even count, which is simply the XOR of the surviving data bits and the parity bit. The data can then be regenerated using the data and parity information from the other disks.
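In XOR terms, the parity block is the bytewise XOR of the data blocks in a stripe, and a lost block is the XOR of everything that survives. A minimal Python sketch, with made-up one-byte blocks and hypothetical function names:

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR across any number of equal-length blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks: list[bytes]) -> bytes:
    """Rebuild the one missing block from the surviving data and parity blocks."""
    return xor_blocks(*surviving_blocks)


if __name__ == "__main__":
    data = [b"\x0f", b"\xf0", b"\x55"]
    parity = xor_blocks(*data)
    # Pretend the disk holding data[1] has failed.
    rebuilt = reconstruct([data[0], data[2], parity])
    print(parity.hex(), rebuilt.hex())
```

The same XOR routine serves both purposes, which is why a RAID5 set can survive the loss of any single drive, whether it held data or parity for a given stripe.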

Some storage array controllers can eliminate most of the RAID5 write penalty by gathering writes in mirrored cache, calculating the parity only once for all the data gathered, then writing it all down to the disks. This reduces the number of reads and writes that need to occur for multiple write streams, which is very efficient for streaming write data such as database transaction logs.
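A rough Python sketch of why gathering writes helps, assuming the cache has collected a whole stripe's worth of new data (the simplest case). The I/O counts come from the four-operation penalty described earlier; the function names are hypothetical.

```python
from functools import reduce


def rmw_io_count(blocks_written: int) -> int:
    """Each block written on its own costs two reads and two writes."""
    return 4 * blocks_written


def full_stripe_write(data_blocks: list[bytes]) -> tuple[list[bytes], int]:
    """Compute parity once from cached data; return the stripe and its I/O count."""
    parity = bytes(reduce(lambda x, y: x ^ y, column)
                   for column in zip(*data_blocks))
    stripe = list(data_blocks) + [parity]
    return stripe, len(stripe)  # one write per disk, no reads needed


if __name__ == "__main__":
    cached = [b"\x01", b"\x02", b"\x04", b"\x08"]  # four data blocks, five disks
    stripe, ios = full_stripe_write(cached)
    print("separate writes:", rmw_io_count(len(cached)), "I/Os")
    print("full-stripe write:", ios, "I/Os, parity", stripe[-1].hex())
```

In this example the same four blocks cost 16 disk operations when written one at a time and 5 when written as one stripe, because no old data or old parity has to be read back.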

The best reference I have found for this material is a book called "The RAID Book," which was published by Digital Press a few years back.

Chris

Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.

This was last published in November 2002
