How does RAID5/parity really work? Does the parity block act as a temporary buffer? What exactly is going on here (XOR) wise? Also could you point me to a reference?
RAID5 works by combining three or more physical disk drives into a single logical entity. The supported range is typically 3 to 32 drives, although most RAID arrays cap a set at around 15.
RAID5 uses "distributed parity." This means that the parity information generated for each write to the logical disk is "striped" across all the disks in the RAID5 "set," rather than being stored on one dedicated parity disk.
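To picture how distributed parity moves from disk to disk, here is a minimal Python sketch. The rotation rule it uses (parity starts on the last disk and shifts one disk to the left on each stripe) is only one common layout and is an assumption for illustration; real controllers may rotate parity differently. The function name parity_disk is hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Return the index of the disk holding parity for a given stripe.

    Assumed rotation: parity starts on the last disk and moves one
    disk to the left with each successive stripe, then wraps around.
    """
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int) -> list[list[str]]:
    """Build a table of 'P' (parity) and 'D' (data) markers per stripe."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows


# Show where parity lands for the first five stripes of a 4-disk set.
for stripe, row in enumerate(layout(5, 4)):
    print(f"stripe {stripe}: {' '.join(row)}")
```

Because no single disk holds all the parity, parity writes are spread over every disk in the set instead of queuing up on one drive.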
Parity for a RAID5 set is generated, usually in hardware by the RAID controller, by a process known as "read, modify, write." On a RAID5 set, a write operation is not complete until the following steps have occurred:
1. Read old data from the disks
2. Read the old parity from the disks and calculate the new parity by XORing the old parity with the old data and the new data to be written
3. Write the new data to disks
4. Write the new parity to disks
This means that every small write to the logical disk actually turns into four disk operations: two reads and two writes. This is known as the "RAID5 write penalty." RAID5 is a tradeoff between speed, reliability and disk space. It uses less disk space for parity data than a RAID1 mirror set uses for its mirror copy, but there is a performance hit due to the parity overhead.
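The read, modify, write sequence described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular controller implements it: the "disks" are just a Python list of equal-sized byte blocks forming one stripe, and the names xor_blocks and small_write are hypothetical.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks must be the same length")
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write(stripe: list, data_idx: int, parity_idx: int, new_data: bytes):
    """Update one data block in a stripe; return (reads, writes) performed."""
    old_data = stripe[data_idx]          # 1. read old data
    old_parity = stripe[parity_idx]      # 2. read old parity ...
    # ... and fold the change into it: new = old parity ^ old data ^ new data
    new_parity = xor_blocks(old_parity, old_data, new_data)
    stripe[data_idx] = new_data          # 3. write new data
    stripe[parity_idx] = new_parity      # 4. write new parity
    return 2, 2


# One stripe: three data blocks followed by their parity block.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
stripe = data + [xor_blocks(*data)]

reads, writes = small_write(stripe, data_idx=1, parity_idx=3, new_data=b"\xff\x00")
print("disk operations:", reads + writes)

# The updated parity matches a parity recomputed from scratch.
assert stripe[3] == xor_blocks(stripe[0], stripe[1], stripe[2])
```

Note that only the block being changed and the parity block are touched; the other data disks in the stripe are never read.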
The actual mathematical calculation of the parity data is done as follows:
The parity is generated by grouping together the bits (0s and 1s) that sit at the same position in each data block of a stripe, then combining each group with an XOR operation. XOR produces a 1 when the group contains an odd number of 1s and a 0 when it contains an even number. The resulting parity bit is stored with each group of data bits, so that every group, parity bit included, always has an even number of 1s. When the stripe is read back, any group that arrives with an odd number of 1s must be in error. When a disk fails or a block cannot be read, the missing data is regenerated by XORing the corresponding bits from all the remaining disks, parity included.
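As a small worked example of the XOR arithmetic, the Python sketch below computes parity for three data blocks, then rebuilds one of them as if its disk had failed. The block contents and the helper name xor_blocks are made up for illustration.

```python
def xor_blocks(blocks) -> bytes:
    """XOR a sequence of equal-length byte blocks, position by position."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Three data blocks in one stripe, each living on a different disk.
data = [b"RAID", b"five", b"demo"]
parity = xor_blocks(data)

# Sanity check: XORing the data and the parity together gives all zeros,
# meaning every bit position holds an even number of 1s.
assert xor_blocks(data + [parity]) == bytes(4)

# Simulate losing the disk that holds data[1], then rebuild its block
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
print("rebuilt block:", rebuilt)
```

The same XOR that produced the parity also recovers the missing block, which is why a RAID5 set can survive the loss of any one disk but not two.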
Some storage array controllers can eliminate most of the RAID5 write penalty by gathering writes in mirrored cache, calculating the parity only once for all the data gathered, then writing it to the disks. When the cache holds a full stripe of new data, the parity can be computed from that data alone, so no old data or old parity needs to be read first. This reduces the number of reads and writes that need to occur for multiple write streams, which is very efficient for streaming write data such as database transaction logs.
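To see why gathering writes helps, here is a rough Python sketch that compares the disk operations for a full-stripe write against updating the same blocks one at a time. It assumes the cache has collected exactly one full stripe of new data; the function names are hypothetical and real controllers are considerably more involved.

```python
def xor_blocks(blocks) -> bytes:
    """XOR a sequence of equal-length byte blocks, position by position."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def read_modify_write_ios(blocks_updated: int) -> int:
    """Each individually updated block costs 2 reads + 2 writes."""
    return 4 * blocks_updated


def full_stripe_write(new_blocks):
    """Compute parity once from cached data; return (stripe, disk writes).

    No reads are needed because every data block in the stripe is new.
    """
    parity = xor_blocks(new_blocks)
    stripe = list(new_blocks) + [parity]
    return stripe, len(stripe)


# Four cached log blocks that together fill one stripe on a 5-disk set.
cached = [b"log0", b"log1", b"log2", b"log3"]
stripe, full_stripe_ios = full_stripe_write(cached)

print("full-stripe write:", full_stripe_ios, "disk operations")
print("one block at a time:", read_modify_write_ios(len(cached)), "disk operations")
```

In this example the full-stripe write needs 5 disk operations (four data writes and one parity write), while four separate read, modify, write cycles would need 16.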
The best reference I have found for this material is a book called "The RAID Book," which was published by Digital Press a few years back.
Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
This was first published in November 2002