# How does RAID5/parity really work?

How does RAID5/parity really work? Does the parity block act as a temporary buffer? What exactly is going on here (XOR-wise)? Also, could you point me to a reference?

RAID5 works by combining three or more physical disk drives into a single logical entity. A set can usually hold anywhere from 3 to 32 drives, although 15 is the maximum you will see in most RAID arrays.

RAID5 uses "distributed parity." This means that as parity information is generated for each write to the logical disk (the RAID5 "set"), the parity data is "striped" across all the disks in the set rather than kept on a single dedicated parity disk.
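To make the idea of striped parity concrete, here is a minimal sketch of how the parity block might rotate from disk to disk. The rotation rule and the function names `parity_disk` and `stripe_layout` are assumptions chosen for illustration; real controllers use a variety of layouts.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Return the index of the disk that holds parity for a given stripe.

    Assumption: parity moves back one disk per stripe and wraps around.
    This is one possible rotation, not any specific controller's layout.
    """
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label each disk in a stripe as data ("D") or parity ("P")."""
    p = parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(num_disks)]


# Show the first four stripes of a hypothetical 4-disk set.
for s in range(4):
    print(s, stripe_layout(s, 4))
```

Over any run of `num_disks` consecutive stripes, each disk holds parity exactly once, so no single drive becomes a parity bottleneck.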

Parity for a RAID5 set is generated, usually in hardware by the RAID controller, through a process known as "read, modify, write." On a RAID5 set, a write operation is not complete until the following steps have occurred:

1. Read the old data from the disks
2. Read the old parity from the disks, then calculate the new parity by XORing the old parity with the difference (also an XOR) between the old data and the new data to be written
3. Write the new data to the disks
4. Write the new parity to the disks

This means that every logical write operation turns into four disk operations: two reads and two writes. This is known as the "RAID5 write penalty." RAID5 is a tradeoff between speed, reliability and disk space. It uses less disk space for redundancy than RAID1 mirror sets (one disk's worth of parity per set, versus a full second copy of the data), but writes take a performance hit due to the parity overhead.
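The read, modify, write sequence can be sketched in a few lines. This is a simplified model that treats blocks as byte strings held in memory; the helper names `xor_blocks` and `small_write` and the sample values are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Return the new parity for a single-block update.

    The two arguments old_data and old_parity stand for the two reads;
    the caller then performs the two writes (new data, new parity).
    """
    delta = xor_blocks(old_data, new_data)  # bits that changed in the data
    return xor_blocks(old_parity, delta)    # flip the same bits in the parity


# Hypothetical 3-disk stripe: two data blocks plus one parity block.
d0, d1 = b"\x0f\xf0", b"\x33\x55"
parity = xor_blocks(d0, d1)

new_d0 = b"\xff\x00"
new_parity = small_write(d0, parity, new_d0)
```

Note that the other data block (`d1`) never has to be read: the old data and old parity are enough to compute the new parity, which is why the cost stays at four operations regardless of how many disks are in the set.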

The actual mathematical calculation of the parity data is done as follows:

The parity is generated by grouping together the bits (0s and 1s) that sit at the same position in each data block of a stripe and combining them with an XOR operation. XOR returns 1 when the group contains an odd number of 1s and 0 when it contains an even number, so once the parity bit is added, every group has an even number of 1s. When the stripe is read back, any group of bits that arrives with an odd number of 1s must be in error. If a disk fails or a block cannot be read, the missing data can then be regenerated by XORing the surviving data blocks with the parity information from the other disks.
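As a small illustration of XOR parity and regeneration, the sketch below computes a parity block for three data blocks and then rebuilds one of them as if its disk had been lost. The function name `xor_parity` and the sample bytes are made up for the example.

```python
from functools import reduce


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three hypothetical data blocks from one stripe.
data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_parity(data)

# Pretend the disk holding data[1] has failed: XOR everything that is left.
survivors = [data[0], data[2], parity]
rebuilt = xor_parity(survivors)
```

The same function serves for both generating parity and rebuilding a lost block, because XORing a value in twice cancels it out.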

Some storage array controllers can eliminate most of the RAID5 write penalty by gathering writes in mirrored cache, calculating the parity only once for all the data gathered, and then writing the data and parity down to the disks together. This reduces the number of reads and writes needed for multiple write streams, which is very efficient for streaming write data such as database transaction logs.
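A rough way to see the saving is to count disk operations. The sketch below is a deliberately simplified model, assuming one block per disk per stripe and ignoring cache behavior; the function names and the 5-disk set are hypothetical.

```python
def rmw_ios(blocks_written: int) -> int:
    """Disk operations when each block is updated separately:
    2 reads + 2 writes per block (read, modify, write)."""
    return 4 * blocks_written


def full_stripe_ios(num_disks: int) -> int:
    """Disk operations when a whole stripe is written from cache:
    no reads, one write per disk (data blocks plus one parity block)."""
    return num_disks


# Hypothetical 5-disk set: 4 data blocks + 1 parity block per stripe.
n = 5
separate = rmw_ios(n - 1)      # update all 4 data blocks one at a time
gathered = full_stripe_ios(n)  # write the same 4 blocks as one full stripe
print(separate, gathered)
```

In this model, gathering the four block updates into one full-stripe write cuts the work from 16 disk operations to 5, because the parity is computed once from the cached data and nothing has to be read back first.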

The best reference I have found for this material is a book called "The RAID Book," which was published by Digital Press a few years back.

Chris

Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
