Parity functions

RAID 5 and XOR

RAID 5 uses the bitwise "exclusive OR" (XOR) function to compute parity values from the array data. The XOR function has two important properties:

  1. If A xor B = C, then A = C xor B, and also B = C xor A.
  2. If A and B occupy the same number of bits, C also occupies that number of bits.

A xor B = C

A B C
0 0 0
1 0 1
0 1 1
1 1 0

Together, these properties make it possible to calculate any single missing value given all the others.
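As a minimal sketch of that idea, the snippet below computes a parity block over three data blocks and then rebuilds one of them from the survivors. The helper name xor_blocks and the sample byte values are illustrative assumptions, not part of any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bitwise XOR of equally sized byte blocks."""
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks of one stripe (illustrative values).
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x01"]
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed:
# XOR-ing the surviving data blocks with the parity block restores it.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]

# Property 2: the parity block is exactly as large as a data block.
assert len(parity) == len(data[0])
```

The same call works for any number of data blocks, which is why a single XOR parity can cover an array of any width but can only ever restore one missing block.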

RAID 6 and Reed-Solomon code

RAID 6 uses two different functions to calculate parity, because the result of the XOR function does not depend on the position of the original data:
1 XOR 0 = 1,
0 XOR 1 = 1, and,
in general, P(A,B) = P(B,A).

For RAID 6, it is not enough simply to add one more XOR function. A second XOR parity would just repeat the first, so if two disks in a RAID 6 array fail, XOR alone cannot determine which value belongs to which of the missing data blocks. Thus, in addition to the XOR function, RAID 6 arrays use a Reed-Solomon code that produces different values depending on the location of the data blocks, so that Q(A,B) ≠ Q(B,A).
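The sketch below illustrates the difference between P and Q on single bytes. It assumes the Galois field GF(2^8) with the polynomial 0x11d and the generator 2, which is a commonly used choice (for example, in the Linux md driver); the article does not state which parameters a given controller uses, and all function names here are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base, exponent):
    """Raise base to a non-negative integer power by repeated multiplication."""
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, base)
    return result


def gf_inv(a):
    """Multiplicative inverse: a^254, since a^255 == 1 for any non-zero a."""
    return gf_pow(a, 254)


def syndromes(data):
    """Return (P, Q) for one byte from each data disk, listed in disk order."""
    p = q = 0
    for index, value in enumerate(data):
        p ^= value                            # P: plain XOR, position-independent
        q ^= gf_mul(gf_pow(2, index), value)  # Q: weighted by 2^index, position-dependent
    return p, q


def recover_two(data, x, y, p, q):
    """Rebuild the bytes at failed data positions x and y from P and Q."""
    # Contribution of the surviving disks (failed positions count as zero).
    p_rest, q_rest = syndromes([0 if i in (x, y) else v for i, v in enumerate(data)])
    p_xy = p ^ p_rest  # equals Dx ^ Dy
    q_xy = q ^ q_rest  # equals 2^x * Dx ^ 2^y * Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    # Substituting Dy = p_xy ^ Dx gives q_xy = (2^x ^ 2^y) * Dx ^ 2^y * p_xy.
    dx = gf_mul(q_xy ^ gf_mul(gy, p_xy), gf_inv(gx ^ gy))
    return dx, p_xy ^ dx


# P(A,B) == P(B,A), but Q(A,B) != Q(B,A).
assert syndromes([1, 0])[0] == syndromes([0, 1])[0]
assert syndromes([1, 0])[1] != syndromes([0, 1])[1]

# Two data disks lost: P and Q together give two independent equations.
stripe = [0x12, 0x34, 0x56, 0x78]
p, q = syndromes(stripe)
damaged = [0x12, None, 0x56, None]
assert recover_two(damaged, 1, 3, p, q) == (0x34, 0x78)
```

Because each disk position gets its own multiplier in Q, the pair (P, Q) forms two independent equations, which is what allows two unknown blocks to be solved for.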

Parity Placement

One or two parity functions can be distributed across the disks in several different patterns. The trivial solution is to put the parity on a single disk. The result is RAID 4, which has low write performance because every write operation also touches the parity disk, which becomes a bottleneck. It is much more efficient to spread the parity evenly across all disks. With a single parity function, the only choice is between left and right rotation. With two parity functions, one more parameter is required: the position of the first parity function relative to the second, that is, which parity is on the left and which is on the right.

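[Figure: two rotating RAID 6 parity layouts. In the first, Q sits immediately to the left of P in each row; in the second, P sits immediately to the left of Q. In the last row of each layout the pair wraps around the edge of the array, so that one parity block lands on the first disk and the other on the last.]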
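The parameters described above can be sketched as a small layout generator. This is an illustration only: the function name raid6_layout, the four-disk width, the starting column, and the reading of "left rotation" as a shift of one column toward the first disk per row are all assumptions, not taken from any controller's documentation.

```python
def raid6_layout(disks, rows, rotation="left", q_after_p=True):
    """Return one list per row, marking each disk as "P", "Q" or "D" (data)."""
    if rotation not in ("left", "right"):
        raise ValueError("rotation must be 'left' or 'right'")
    step = -1 if rotation == "left" else 1
    # Assumed starting column: the last two disks for left rotation,
    # the first two disks for right rotation.
    start = disks - 2 if rotation == "left" else 0
    layout = []
    for row in range(rows):
        first = (start + step * row) % disks  # left-hand block of the parity pair
        second = (first + 1) % disks          # right-hand block, may wrap around
        p_disk, q_disk = (first, second) if q_after_p else (second, first)
        cells = ["D"] * disks
        cells[p_disk] = "P"
        cells[q_disk] = "Q"
        layout.append(cells)
    return layout


for label, q_after_p in (("Q left of P", False), ("P left of Q", True)):
    print(label)
    for cells in raid6_layout(4, 4, "left", q_after_p):
        print(" ".join(cells))
```

Changing rotation to "right" or flipping q_after_p yields the remaining combinations, which is why both parameters have to be determined when reconstructing an array.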

Note that there is no guarantee that the parity always moves by one column per row. In RAID 5, there is little point in moving the parity by more than one column, so you are unlikely to ever encounter such a configuration. In RAID 6, such a configuration does exist and is called wide pace. Promise controllers, for example, use this parity layout.

[Figure: RAID 6 layout with wide pace, in which the P and Q pair moves by more than one column per row.]

Additionally, there is no guarantee that a parity block is the same size as a data block. If the parity block is larger than the data block, the result is an array with so-called delayed parity, which is used in HP Smart Array controllers.

 1   2   P
 3   4   P
 6   P   5
 8   P   7
 P   9  10
 P  11  12
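The three-disk example above can be reproduced with a short sketch. The function name delayed_parity_layout and the parameter delay (the number of data rows covered by one parity position) are hypothetical, and the data ordering, where numbering continues on the disk after the parity disk and wraps around, is inferred from the example rather than from HP documentation.

```python
def delayed_parity_layout(disks, rows, delay):
    """Return one list per row with "P" for parity and block numbers for data."""
    layout = []
    block = 1
    for row in range(rows):
        group = row // delay                       # rows that share a parity column
        parity_disk = (disks - 1 - group) % disks  # moves one column left per group
        cells = [None] * disks
        cells[parity_disk] = "P"
        # Data starts on the disk after the parity disk and wraps around.
        for offset in range(1, disks):
            cells[(parity_disk + offset) % disks] = str(block)
            block += 1
        layout.append(cells)
    return layout


for cells in delayed_parity_layout(disks=3, rows=6, delay=2):
    print(" ".join(cell.rjust(2) for cell in cells))
```

With delay=1 the same function degenerates into an ordinary rotating RAID 5 layout, which shows that delayed parity is just one more parameter of the placement rather than a different scheme.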

MS Storage Spaces has another interesting peculiarity: it changes the disk order arbitrarily at long intervals, on the order of hundreds of megabytes.

Note: The bottleneck of RAID 4 is the disk storing parity, but this holds only when all the disks are identical. If you replace the parity disk with an SSD, or with an array of several small SSDs, the disadvantage turns into an advantage, and you can get a relatively cheap array with RAID 0 performance and RAID 5 fault tolerance.
