disk striping

Contributor(s): Carol Sliwa

Disk striping is the process of dividing a body of data into blocks and spreading the data blocks across multiple storage devices, such as hard disks or solid-state drives (SSDs). A stripe consists of the data divided across the set of hard disks or SSDs, while a stripe unit, or strip, refers to the data slice on an individual drive.

Storage systems vary in the way they perform data striping. A system may stripe data at the byte, block or partition level, and it can stripe data across all or only some of the disks in a cluster. For instance, a storage system with 10 hard disks might write a 64 KB block to each of the first, second, third, fourth and fifth disks and then start over again at the first disk. Another system might write 1 megabyte (MB) to each of its 10 disks before returning to the first disk to repeat the process.
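The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration, assuming a simple layout with a fixed stripe unit size and no parity; the function name locate and its parameters are hypothetical, and real storage systems may place data differently.

```python
def locate(offset: int, stripe_unit: int, num_disks: int) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes plain round-robin striping with a fixed stripe unit size.
    """
    unit_index = offset // stripe_unit      # which stripe unit, counted from 0
    disk = unit_index % num_disks           # round-robin across the disks
    stripe_row = unit_index // num_disks    # full stripes that come before it
    disk_offset = stripe_row * stripe_unit + offset % stripe_unit
    return disk, disk_offset


KB = 1024

# 64 KB stripe units across five disks, as in the first example above.
for logical in (0, 64 * KB, 4 * 64 * KB, 5 * 64 * KB):
    print(logical, locate(logical, 64 * KB, 5))
```

The sixth 64 KB block (logical offset 5 x 64 KB) lands back on the first disk, one stripe unit further in, which is the "start over again at the first disk" behavior.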

Pros and cons of disk striping

The main advantage of disk striping is higher performance, because reads and writes can be serviced by several drives in parallel. For example, striping data across three hard disks could provide up to three times the bandwidth of a single drive. If each drive runs at 200 input/output operations per second (IOPS), disk striping would make available up to 600 IOPS for data reads and writes.

The disadvantage of disk striping is low resiliency. The failure of any physical drive in the striped disk set results in the loss of the data on the striped unit, and consequently, the loss of the entire data set stored across the set of striped hard disks.
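To see why resiliency drops as drives are added, here is a rough sketch of the arithmetic. It assumes each drive fails independently with the same probability over some period, which is a simplification; the 2% figure and the function name are made up for illustration.

```python
def striped_set_loss_probability(p_drive: float, num_drives: int) -> float:
    """Chance that at least one drive in a striped set without parity fails.

    Assumes independent drive failures, each with probability p_drive.
    Any single failure loses the whole data set.
    """
    return 1.0 - (1.0 - p_drive) ** num_drives


# Hypothetical 2% failure probability per drive over the period considered.
for drives in (1, 3, 10):
    print(drives, round(striped_set_loss_probability(0.02, drives), 4))
```

Under these assumptions, the risk of losing the data set grows with every drive added to the stripe.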

Disk striping and RAID

Most redundant array of independent disks (RAID) levels use disk striping to distribute and store data across multiple physical drives. Disk striping without parity is synonymous with RAID 0, which spreads the data across all the disk drives in a RAID group. Because it stores no parity, RAID 0 is not fault tolerant.

Disk striping without parity may be used for temporary data, scratch space, or in situations where a master copy of the data is easily recoverable from another storage device.

Disk striping with parity

To address the potential for data loss with RAID 0, a RAID set typically uses at least one stripe for parity. The parity information is commonly calculated by using the binary exclusive or (XOR) function and stored on a physical drive in the RAID set. If a storage drive in the striped RAID set fails, the data is recoverable from the remaining drives and the parity stripe.
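The XOR calculation and the recovery step can be illustrated with a small Python sketch. The strip contents and the helper name xor_blocks are invented for the example; it assumes one parity strip per stripe and equal-length strips, and it is not how any particular RAID implementation is written.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data strips of one stripe (illustrative contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails: XOR the survivors with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving strips with the parity cancels them out and leaves the missing strip.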

For a RAID set with n drives, the data might be striped on drives 1 through n-minus-1, and the nth drive would be reserved for parity. For example, in a RAID set with 10 drives, data could be striped to nine drives, and the 10th drive would be used for parity.

Disk striping with parity provides redundancy and reliability. RAID 4 and RAID 5 protect against a single drive failure. RAID 6 uses the capacity of two drives for parity and protects against two drive failures. Data protection can be extended beyond two storage device failures using erasure coding.

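One disadvantage of disk striping with parity is the performance penalty for small random writes. Each write also requires the parity to be recalculated and rewritten, so the system must first read existing data or parity from the stripe before the write can complete.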
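One common way to handle a small write in a single-parity set is read-modify-write: read the old strip and the old parity, then compute the new parity from those two and the new data. The sketch below assumes that strategy and XOR parity; the function names and byte values are illustrative only.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after overwriting one strip, without reading the other strips."""
    return xor(xor(old_parity, old_data), new_data)


strips = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor(xor(strips[0], strips[1]), strips[2])

# Overwrite the middle strip: two reads (old strip, old parity) and
# two writes (new strip, new parity) for one logical write.
new_strip = b"\xaa\x55"
updated = small_write_parity(parity, strips[1], new_strip)

# Same result as recomputing parity across the whole stripe.
full = xor(xor(strips[0], new_strip), strips[2])
print(updated == full)
```

Either way, one logical write turns into several physical I/O operations, which is where the penalty comes from.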

Disk striping and disk mirroring

Disk striping can be combined with disk mirroring, or RAID 1, to speed performance and expand capacity by striping data across multiple sets of mirrored drives, a combination commonly known as RAID 10. The disadvantage of disk striping with mirroring is the 50% overhead inherent in using half the capacity to make an exact copy of the data for protection.
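The capacity trade-off between mirroring and parity can be compared with some simple arithmetic. This sketch assumes equal-sized drives and ignores spares and metadata; the function names are hypothetical, and the 10-drive figure reuses the example from the parity section.

```python
def usable_fraction_mirrored(copies: int = 2) -> float:
    """Fraction of raw capacity left when every block is stored `copies` times."""
    return 1 / copies


def usable_fraction_parity(num_drives: int, parity_drives: int = 1) -> float:
    """Fraction of raw capacity left after reserving parity_drives' worth of space."""
    return (num_drives - parity_drives) / num_drives


drives = 10
print("striping with mirroring:", usable_fraction_mirrored())
print("single parity:", usable_fraction_parity(drives, 1))
print("dual parity:", usable_fraction_parity(drives, 2))
```

With 10 drives, mirroring leaves half the raw capacity usable, while single and dual parity leave nine-tenths and eight-tenths respectively.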

This was last updated in March 2015

Join the conversation


Will disk striping without parity or with RAID make more sense for your data set?
We only use disk striping without parity in limited instances, for obvious reasons. Most of our data is way too valuable to risk, and our IOPS requirements aren't that intense. We'd rather spend the money for redundancy and have the data than not do it and risk any data loss. Even for our transactional applications that's the case, because we have to keep records of all transactions, and we get fined if we don't comply.
I agree with Mr timethy2, because critical data such as transaction records or employee information, to name but a few, should be stored in safe and well-protected storage. However, performance and redundancy are also important subjects to consider, especially when you're dealing with a great amount of data. So I think the choice is a serious one and can reorient the whole enterprise's future.
I have to say it's difficult for me to see a use case where performance is more important than resiliency. All the performance in the world isn't going to help if you end up not being able to read the data afterwards.

