RAID 5 overview

May 30, 2009

RAID 5 is a method of spreading volume data across multiple disk drives. The DS6000™ series supports RAID 5 arrays.

RAID 5 increases performance by supporting concurrent access to the multiple disk drive modules (DDMs) within each logical volume. Data protection is provided by parity, which is distributed across the drives in the array. If a drive fails, the data on that drive can be reconstructed from all the other drives in the array together with the parity information that was created when the data was stored.
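The recovery step described above can be sketched in a few lines of Python. This is only an illustration, assuming parity is computed as a bytewise XOR of the data blocks in a stripe (the usual approach for RAID 5; the text does not say how the DS6000 computes it). The name `xor_blocks` and the sample blocks are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column (one byte from each).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks from one stripe (hypothetical contents).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held data_blocks[1]:
# XOR the surviving data blocks with the parity block to rebuild it.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

Because XOR is its own inverse, the same function serves both to create the parity block and to rebuild any single missing block.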

One of the most popular RAID levels, RAID 5 stripes both data and parity information across three or more drives. It is similar to RAID 4 except that it exchanges the dedicated parity drive for a distributed parity algorithm, writing data and parity blocks across all the drives in the array. This removes the "bottleneck" that the dedicated parity drive represents, improving write performance slightly and allowing somewhat better parallelism in a multiple-transaction environment, though the overhead necessary in dealing with the parity continues to bog down writes. Fault tolerance is maintained by ensuring that the parity information for any given block of data is placed on a drive separate from those used to store the data itself. The performance of a RAID 5 array can be "adjusted" by trying different stripe sizes until one is found that is well-matched to the application being used.
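To make the idea of distributed parity concrete, here is a small sketch of one possible placement scheme. It assumes the parity block simply rotates one drive to the left on each successive stripe; real controllers use a variety of layouts, and nothing here is taken from a specific product. The function names `parity_drive`, `locate` and `layout` are made up for this example.

```python
def parity_drive(stripe, num_drives):
    """Drive index holding parity for a stripe (assumed rotating scheme)."""
    return (num_drives - 1 - stripe) % num_drives


def locate(unit, num_drives):
    """Map a logical data unit number to (stripe, drive)."""
    data_per_stripe = num_drives - 1
    stripe, offset = divmod(unit, data_per_stripe)
    p = parity_drive(stripe, num_drives)
    # Skip over the parity drive when counting data positions.
    drive = offset if offset < p else offset + 1
    return stripe, drive


def layout(num_stripes, num_drives):
    """Build a table of labels: 'P' for parity, 'Dn' for data unit n."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_drive(stripe, num_drives)
        unit = stripe * (num_drives - 1)
        row = []
        for drive in range(num_drives):
            if drive == p:
                row.append("P")
            else:
                row.append(f"D{unit}")
                unit += 1
        rows.append(row)
    return rows


for row in layout(4, 4):
    print(" ".join(f"{cell:>4}" for cell in row))
```

On a four-drive array this places parity on drive 3, then 2, then 1, then 0, so no single drive handles every parity write. Under RAID 4 the `P` column would stay on the same drive for every stripe.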


This illustration shows how files of different sizes are distributed
between the drives on a four-disk RAID 5 array using a 16 kiB stripe
size. As with the RAID 0 illustration, the red file is 4 kiB in size; the blue
is 20 kiB; the green is 100 kiB; and the magenta is 500 kiB, with each
vertical pixel representing 1 kiB of space. Contrast this diagram with the
one for RAID 4, which is identical except that the data is only on three
drives and the parity (shown in gray) is exclusively on the fourth drive.

Controller Requirements: Requires a moderately high-end card for hardware RAID; supported by some operating systems for software RAID, but at a substantial performance penalty.

Hard Disk Requirements: Minimum of three standard hard disks; maximum set by controller. Should be of identical size and type.

Array Capacity: (Size of Smallest Drive) * (Number of Drives - 1).

Storage Efficiency: If all drives are the same size, (Number of Drives - 1) / (Number of Drives).
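The capacity and efficiency formulas above translate directly into code. The sketch below is only a worked example; the function names and the drive sizes (in GB) are hypothetical.

```python
def raid5_capacity(drive_sizes):
    """Usable capacity: (size of smallest drive) * (number of drives - 1)."""
    if len(drive_sizes) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return min(drive_sizes) * (len(drive_sizes) - 1)


def raid5_efficiency(num_drives):
    """Storage efficiency for identical drives: (n - 1) / n."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (num_drives - 1) / num_drives


print(raid5_capacity([500, 500, 500, 500]))  # 1500
print(raid5_capacity([500, 400, 500]))       # 800
print(raid5_efficiency(4))                   # 0.75
```

The second call shows why drives should be of identical size: with one 400 GB drive in the set, 100 GB on each of the larger drives goes unused.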

Fault Tolerance: Good. Can tolerate loss of one drive.

Availability: Good to very good. Hot sparing and automatic rebuild are usually featured on hardware RAID controllers supporting RAID 5 (software RAID 5 will require down-time).

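Degradation and Rebuilding: Due to distributed parity, performance degradation can be substantial after a failure and during rebuilding, because data from the failed drive must be reconstructed from the remaining drives and parity.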

Random Read Performance: Very good to excellent; generally better for larger stripe sizes. Can be better than RAID 0 since the data is distributed over one additional drive, and the parity information is not required during normal reads.

Random Write Performance: Only fair, due to parity overhead; this is improved over RAID 3 and RAID 4 due to eliminating the dedicated parity drive, but the overhead is still substantial.
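The parity overhead on small writes can be illustrated with a sketch of the common read-modify-write update. It assumes XOR parity, which the text does not state explicitly, and the helper names and block contents are hypothetical. Changing one data block means reading the old data and old parity, then writing the new data and new parity: four disk operations for a single logical write.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data, old_parity, new_data):
    """New parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# One stripe with three data blocks (hypothetical contents) and its parity.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

# Overwrite block 1 using only its old contents and the old parity.
new_block = b"\xaa\x55"
new_parity = updated_parity(stripe[1], parity, new_block)
stripe[1] = new_block

# Recomputing parity from the whole stripe gives the same result.
full_recompute = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])
print(new_parity == full_recompute)  # True
```

The shortcut avoids reading the other data blocks in the stripe, but the extra reads and the second write are still what make random writes only fair on RAID 5.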

Sequential Read Performance: Good to very good; generally better for smaller stripe sizes.

Sequential Write Performance: Fair to good.

Cost: Moderate, but often less than that of RAID 3 or RAID 4 due to its greater popularity, and especially if software RAID is used.

Special Considerations: Due to the amount of parity calculating required, software RAID 5 can seriously slow down a system. Performance will depend to some extent upon the stripe size chosen.

Recommended Uses: RAID 5 is seen by many as the ideal combination of good performance, good fault tolerance, and high capacity and storage efficiency. It is best suited for transaction processing and is often used for "general purpose" service, as well as for relational database applications, enterprise resource planning and other business systems. For write-intensive applications, RAID 1 or RAID 1+0 is probably a better choice (albeit at a higher hardware cost), as the performance of RAID 5 decreases substantially in a write-heavy environment.
