RAID originally stood for redundant array of inexpensive disks. Today, the acronym has been updated to redundant array of independent disks. But RAID's purpose hasn't changed.
RAID is a common method for protecting application data on hard disk drives and solid-state storage, with different types of RAID balancing the level of protection against price. The greater the protection, the higher the cost. As data storage has evolved, the number of RAID levels has increased.
RAID is a way of grouping individual physical drives together to form a RAID set. The RAID set presents all the physical drives as one logical disk to your server. The logical disk is identified by a logical unit number, or LUN.
Improvements to RAID performance and availability have kept it in use even as newer, alternative technologies have become available. Erasure coding and SSDs have presented reliable -- if more expensive -- alternatives, and as storage capacity increases, the chance of RAID array errors increases, too. Still, storage vendors continue to support RAID levels in their storage arrays.
To fully understand RAID and its benefits, it's important to break down the different RAID levels and what they each do best.
- RAID 0: Disk striping
- RAID 1: Disk mirroring
- RAID 1+0: Disk mirroring and striping
- RAID 2: Striping and Hamming code parity
- RAID 3: Parity disk
- RAID 4: Parity disk and block-level striping
- RAID 5: Disk striping with parity
- RAID 5+0: Disk striping and distributed parity
- RAID 6: Disk striping with double parity
- Adaptive RAID: Option to use RAID 3 or RAID 5
- RAID 7: Nonstandard with caching
RAID levels explained
RAID levels can be broken into three categories: standard, nonstandard and nested. Standard levels of RAID are made up of the basic types of RAID numbered 0 through 6. A nonstandard RAID level is set to the standards of a particular company or open source project. Nonstandard RAID includes RAID 7, adaptive RAID, RAID S and Linux md RAID 10. Nested RAID refers to combinations of RAID levels, such as RAID 10 (RAID 1+0) and RAID 50 (RAID 5+0).
The RAID level you use should depend on the type of application you're running on your server. RAID 0 is the fastest, RAID 1 is the most reliable and RAID 5 is a good combination of both. The best RAID for your organization may depend on the level of data redundancy you're looking for, the length of your retention period, the number of disks you're working with and the importance you place on data protection versus performance optimization.
Below is a description of the different RAID types most commonly used in storage arrays. Not all storage array vendors support every RAID type, so be sure to check with your vendors for the types of RAID that are available with their data storage.
RAID 0: Disk striping
RAID 0 is simple disk striping. All the data is spread out in chunks across all the SSDs or HDDs in the RAID set. RAID 0 offers great performance because you spread the load of storing data onto more physical drives. RAID 0 doesn't make use of disk parity, which is redundant information calculated from the data that lets a controller rebuild the contents of a failed drive. Because RAID 0 uses neither parity nor mirroring, it doesn't have data redundancy or fault tolerance.
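The round-robin placement that striping implies can be sketched in a few lines of Python. This is a minimal illustration, not a controller implementation: the function names `stripe_location` and `stripe` are hypothetical, and each "chunk" is modeled as one block.

```python
def stripe_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return logical_block % num_disks, logical_block // num_disks


def stripe(blocks: list[bytes], num_disks: int) -> list[list[bytes]]:
    """Distribute blocks round-robin across num_disks drives."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disk, _offset = stripe_location(i, num_disks)
        disks[disk].append(block)
    return disks


# Six blocks spread over three drives: each drive holds every third block.
layout = stripe([b"A", b"B", b"C", b"D", b"E", b"F"], num_disks=3)
for n, contents in enumerate(layout):
    print(f"disk {n}: {contents}")
```

Because consecutive blocks land on different drives, a large sequential read or write can keep every drive busy at once, which is where the performance gain comes from. It also shows why one failed drive leaves holes throughout the data.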
Advantages. Performance is RAID 0's key advantage. Striping data across multiple disks provides more bandwidth than a single disk drive, multiplying the number of IOPS available for data reads and writes. RAID 0 is easy to implement and has the lowest cost of all the RAID types because it uses disk space only to store data. It's widely supported, and because there's no parity generated for RAID 0, there is no overhead to write data to RAID 0 disks.
Disadvantages. RAID 0 has the worst data protection of all the RAID levels. Because RAID 0 has no parity or mirroring and data is striped across every drive, a single disk failure makes the data in the whole RAID set unavailable until it is restored from a backup or another copy.
Best use. RAID 0's lack of redundancy means it should be used for data storage for non-mission-critical applications. It's well suited to applications where data is read and written at high speeds.
RAID 1: Disk mirroring
RAID 1 uses disk mirroring, which means all the data is written to two separate physical disks. The disks are essentially mirror images of one another. If a single disk fails, data can be retrieved from the other disk. RAID 1 requires a minimum of two disk drives.
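As a rough sketch of the mirroring behavior described above, the toy class below writes every block to two in-memory "drives" and reads from whichever one is still healthy. The class name `MirroredPair` and its methods are hypothetical, and a real controller would also handle resynchronizing a replaced drive.

```python
from typing import Optional


class MirroredPair:
    """Toy RAID 1 set: every write goes to both drives."""

    def __init__(self) -> None:
        # Each drive is a dict of block number -> data; None marks a failed drive.
        self.drives: list[Optional[dict[int, bytes]]] = [{}, {}]

    def write(self, block: int, data: bytes) -> None:
        for drive in self.drives:
            if drive is not None:
                drive[block] = data

    def fail(self, index: int) -> None:
        """Simulate the failure of one drive."""
        self.drives[index] = None

    def read(self, block: int) -> bytes:
        # Serve the read from the first surviving drive.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise OSError("both mirrors have failed; data is lost")


mirror = MirroredPair()
mirror.write(0, b"payroll")
mirror.fail(0)          # lose the first drive
print(mirror.read(0))   # still served from the second drive
```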
Advantages. Disk mirroring is good for fast read operations. RAID 1 is also useful for disaster recovery situations, because it provides instantaneous failover. If the primary drive becomes inoperable, the secondary, mirrored drive can take over because the data, operating system and application software are replicated there.
Disadvantages. Write speeds are slower because data must be written twice to disks. Another downside of RAID 1 is it doubles the amount of disk space required because all the data is stored twice.
Best use. RAID 1 works well for high-performance and high-availability applications, including email, operating systems and transactional applications. Its instantaneous failover capability makes it a good choice for mission-critical applications.
RAID 1+0: Disk mirroring and striping
RAID 1+0, which is also called RAID 10, is a nested RAID level that combines disk mirroring and striping. The data is normally mirrored first and then striped. Mirroring striped sets accomplishes the same task, but it's less fault tolerant than striping mirror sets. RAID 1+0 requires a minimum of four physical disks.
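The difference in fault tolerance between the two nesting orders can be checked by brute force. The sketch below assumes a four-disk array with disks grouped as (0, 1) and (2, 3), and counts how many of the six possible two-disk failures each arrangement survives. The function names are illustrative only.

```python
from itertools import combinations

DISKS = range(4)
GROUPS = [(0, 1), (2, 3)]  # the same grouping, interpreted two ways


def raid10_survives(failed: set[int]) -> bool:
    """Stripe of mirrors: data is lost only if both disks of one mirror fail."""
    return all(not set(pair) <= failed for pair in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """Mirror of stripes: a stripe set dies if any member fails, and
    data is lost only when every stripe set has died."""
    return any(not (set(group) & failed) for group in GROUPS)


def count_survivable(survives) -> int:
    """Number of two-disk failure combinations the array survives."""
    return sum(survives(set(combo)) for combo in combinations(DISKS, 2))


print("RAID 1+0 survives", count_survivable(raid10_survives), "of 6 double failures")
print("RAID 0+1 survives", count_survivable(raid01_survives), "of 6 double failures")
```

Under these assumptions, striping across mirrors survives four of the six double failures, while mirroring two stripe sets survives only two.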
Advantages. RAID 10 benefits from the performance capabilities provided through its use of RAID 0. Data is spread across two or more drives, and multiple read/write heads on the drives can access portions of the data simultaneously, resulting in faster processing. Because it uses RAID 1, RAID 10 data is fully protected. If the originating drive fails or is unavailable, the mirror copy can take over.
Disadvantages. Because stripe sets have no parity, data on a failed drive can be recovered only from its mirror copy. In using RAID 1, RAID 10 fully duplicates data, doubling the amount of data storage capacity required. And, with a minimum of four disks required, RAID 10 is more expensive than other RAID levels.
Best use. RAID 10's redundancy and high performance make it a good choice for operations that require minimal downtime. It's also optimal for I/O-intensive applications, such as email, web servers, databases and applications that need high disk performance.
RAID 2: Striping and Hamming code parity
RAID 2 stripes data at the bit level and uses Hamming code to provide parity and detect errors. Parity provides a checksum of the data written to disks. Parity information gets written along with the original data. The server accessing the data on a hardware-based RAID set never knows when one of the drives in the RAID set has gone bad. When that happens, the controller uses the parity information stored on the surviving disks in the RAID set to recreate the data that was lost.
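To make the Hamming code idea concrete, here is a small sketch using the classic Hamming(7,4) code: four data bits plus three check bits, with each bit imagined as living on a different drive. The choice of Hamming(7,4) is an assumption for illustration (the text above does not say which code parameters a given RAID 2 product used), and the function names are hypothetical.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    if len(data) != 4 or any(bit not in (0, 1) for bit in data):
        raise ValueError("data must be a list of four bits")
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # check bit covering positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # check bit covering positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # check bit covering positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> list[int]:
    """Correct at most one flipped bit and return the 4 data bits."""
    if len(codeword) != 7:
        raise ValueError("codeword must have seven bits")
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                      # simulate one drive returning a wrong bit
print(hamming74_decode(word))     # the original four data bits are recovered
```

The three check bits together identify which of the seven positions is wrong, which is what lets the array both detect and correct a single bad bit.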
Advantages. Data protection is a key advantage of RAID 2. The parity provided by the Hamming code delivers data redundancy and fault tolerance.
Disadvantages. RAID 2 is more complex than other RAID levels. It also is more costly than some other levels because it requires additional disk drives to hold the Hamming code bits.
Best use. These days, Hamming codes are already used in the error correction code found in hard drives, so RAID 2 is no longer used.
RAID 3: Parity disk
RAID 3 uses a parity disk to store the parity information generated by a RAID controller on a separate disk from the actual data disks instead of striping it with the data, as in RAID 5. RAID 3 requires a minimum of three physical disks.
Advantages. RAID 3 provides high throughput, making it a good choice for transferring large amounts of data in bulk.
Disadvantages. RAID 3 requires an extra drive for parity. With the parity data stored on a separate disk, RAID 3 performs poorly when there are a lot of small requests for data, as with a database application.
Best use. RAID 3 performs well with applications that require one long, sequential data transfer, such as video servers.
RAID 4: Parity disk and block-level striping
RAID 4 uses a dedicated parity disk along with block-level striping across disks to protect data. With RAID 4, the corresponding bits on the data disks are combined, typically with an exclusive OR (XOR) operation, and the result is kept on the separate parity disk. Those stored parity bits are used to rebuild the missing data when a drive fails.
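Single-parity RAID levels generally compute parity with a bitwise XOR across the data blocks in a stripe. The sketch below shows why that is enough to rebuild one lost block: XORing the surviving data blocks with the parity block yields the missing one. The helper name `xor_blocks` and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of a non-empty list of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and equally sized")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe: three data blocks plus the parity block on the dedicated disk.
data = [b"RAID", b"four", b"demo"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed, then rebuild it
# from the two surviving data blocks and the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```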
Advantages. Striping enables data to be read from any disk. RAID 4 is good for sequential data access.
Disadvantages. The use of a dedicated parity disk can cause performance bottlenecks for write operations, because every write must also update the parity on the dedicated disk.
Best use. With alternatives such as RAID 5 now available, RAID 4 isn't used much.
RAID 5: Disk striping with parity
RAID 5 uses disk striping with parity. Like other RAID levels that use striping, the data is spread across all the disks in the RAID set. The parity information needed to reconstruct the data in case of disk failure is also spread diagonally across disks in the RAID set. RAID 5 is the most common RAID method because it achieves a good balance between performance and availability. RAID 5 requires at least three physical disks.
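The diagonal spread of parity can be visualized with a small layout generator. This sketch assumes one simple rotation scheme, in which parity starts on the last disk and moves one disk to the left on each stripe. Real controllers and software RAID implementations offer several layout variants, so treat the exact placement as an assumption.

```python
def raid5_layout(num_disks: int, num_stripes: int) -> list[list[str]]:
    """Return a grid [stripe][disk] of labels; 'P' marks the parity block."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    grid = []
    block = 0
    for stripe in range(num_stripes):
        # Rotate the parity position so no single disk holds all the parity.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        grid.append(row)
    return grid


for row in raid5_layout(num_disks=4, num_stripes=4):
    print("  ".join(f"{label:>3}" for label in row))
```

Printing the grid shows the parity blocks forming a diagonal, which is how RAID 5 avoids the dedicated parity disk bottleneck of RAID 3 and RAID 4.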
Advantages. The combined use of data striping and distributed parity prevents any single disk from becoming a bottleneck. RAID 5 provides good throughput, with read performance that approaches RAID 0. With parity data spread across all the drives in the RAID set, RAID 5 provides data redundancy and reliability, and it can survive the failure of any one drive. RAID 5 drives can be hot swapped, which avoids downtime when a failed drive is replaced.
Disadvantages. Write performance on RAID 5 drives is slower than read performance because of the parity data calculation. This RAID level also suffers from longer rebuild times and potential data loss if a second drive fails during a rebuild. RAID 5 also requires a more sophisticated controller than other RAID levels.
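The parity calculation behind this write penalty can be sketched as a read-modify-write update. Assuming XOR parity, a small write has to read the old data block and the old parity block, then write the new data and the new parity: four I/O operations for one logical write. The function names below are hypothetical.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity after overwriting one block, without reading the rest of the stripe."""
    return xor(xor(old_parity, old_data), new_data)


# A stripe of three data blocks and its parity.
stripe = [b"aaaa", b"bbbb", b"cccc"]
parity = xor(xor(stripe[0], stripe[1]), stripe[2])

# Overwrite the middle block using only the old block and the old parity.
new_block = b"zzzz"
new_parity = updated_parity(stripe[1], new_block, parity)

# The shortcut matches a full recomputation over the updated stripe.
print(new_parity == xor(xor(stripe[0], new_block), stripe[2]))
```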
Best use. RAID 5 is a good option for application and file servers with a limited number of drives.
RAID 5+0: Disk striping and distributed parity
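RAID 5+0, also known as RAID 50, is another nested RAID level. It stripes data, as in RAID 0, across two or more RAID 5 sets, combining striping and distributed parity to get the advantages of both. Because each RAID 5 set needs at least three disks, RAID 50 has a six-disk minimum requirement.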
Advantages. RAID 50 provides faster write performance than RAID 5. Its data protection features are also a step above RAID 5, and its rebuild time is faster. In the event of a drive failure, performance isn't degraded as much as with RAID 5, because only one of the RAID 5 arrays is affected.
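A short sketch can make the protection claim more concrete. It models RAID 50 as several RAID 5 groups striped together, with disks numbered in contiguous runs per group (an assumption for illustration), and checks whether a given set of failed disks is survivable. The function names are hypothetical.

```python
def raid50_survives(failed: set[int], num_groups: int, disks_per_group: int) -> bool:
    """True if no RAID 5 group has lost more than one disk."""
    if num_groups < 2 or disks_per_group < 3:
        raise ValueError("need at least two groups of at least three disks")
    total = num_groups * disks_per_group
    losses = [0] * num_groups
    for disk in failed:
        if not 0 <= disk < total:
            raise ValueError(f"disk {disk} is not in the array")
        losses[disk // disks_per_group] += 1
    return all(count <= 1 for count in losses)


def raid50_usable_disks(num_groups: int, disks_per_group: int) -> int:
    """Each RAID 5 group gives up one disk's worth of capacity to parity."""
    return num_groups * (disks_per_group - 1)


# Minimum configuration: two RAID 5 groups of three disks each.
print(raid50_survives({0, 3}, num_groups=2, disks_per_group=3))  # one loss per group
print(raid50_survives({0, 1}, num_groups=2, disks_per_group=3))  # two losses in one group
print(raid50_usable_disks(num_groups=2, disks_per_group=3))
```

The array tolerates one failure in every group at the same time, but two failures inside the same group are still fatal, just as with a standalone RAID 5 set.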
Disadvantages. RAID 50's six-disk requirement makes it potentially more expensive than other RAID types. And, like RAID 5, it also needs a more sophisticated controller and synchronized disks.
Best use. RAID 50 is good for applications requiring high reliability and ones that handle high-request and data transfer rates.
RAID 6: Disk striping with double parity
RAID 6 increases reliability by spreading data across multiple disks and enabling I/O operations to overlap to improve performance. RAID 6 stores two independent sets of parity information, which allows the RAID set to survive two disk failures before data is lost. This enables data recovery when a second drive fails during a rebuild, which is more likely with larger capacity drives that have longer rebuild times. RAID 6 requires at least four drives.
Advantages. The dual parity provided with RAID 6 protects against data loss if a second drive fails. The percentage of usable data storage capacity increases as disks are added to a RAID 6 array. Beyond the minimum of four disks, RAID 6 uses less storage capacity than RAID levels that use mirroring.
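The capacity claim is simple arithmetic: with equally sized drives, RAID 6 gives up the equivalent of two drives to parity, so the usable share is (n - 2) / n. The sketch below compares that with the 50% usable share of two-way mirroring. The function name is hypothetical, and equal drive sizes are assumed.

```python
MIRROR_USABLE_FRACTION = 0.5  # two-way mirroring stores every block twice


def raid6_usable_fraction(num_disks: int) -> float:
    """Share of raw capacity left for data in a RAID 6 set of equal-size disks."""
    if num_disks < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (num_disks - 2) / num_disks


for n in (4, 6, 8, 12):
    print(f"{n:>2} disks: RAID 6 usable {raid6_usable_fraction(n):.0%}, "
          f"mirroring usable {MIRROR_USABLE_FRACTION:.0%}")
```

At the four-disk minimum, the two approaches tie at 50%. Every disk added beyond that raises the usable share for RAID 6, while mirroring stays at 50%.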
Disadvantages. RAID 6 has lower performance than RAID 5. Performance can take a significant hit if two drives need to be rebuilt at the same time. RAID 6 can be more expensive, because it requires two extra disks for parity. RAID 6 requires a specialized controller, and RAID controller coprocessors are often used with RAID 6 to do parity calculations and improve write performance.
Best use. RAID 6 is a good option for long-term data retention. It's often used for large-capacity drives deployed for archiving or disk-based backup. With more data protection capabilities than RAID 5, RAID 6 is also a good choice for mission-critical applications.
Adaptive RAID: Option to use RAID 3 or RAID 5
Adaptive RAID lets the RAID controller figure out how to store parity on the disks. It chooses between RAID 3 and RAID 5 depending on which RAID set type will perform better with the type of data being written to the disks.
RAID 7: Nonstandard with caching
RAID 7 is a nonstandard RAID level -- based on RAID 3 and RAID 4 -- that adds caching and requires proprietary hardware. This RAID level is owned and trademarked by the now-defunct Storage Computer Corp.