RAID
RAID (Redundant Array of Inexpensive Disks)
RAID, short for Redundant Array of Inexpensive Disks or Redundant Array of Independent Disks, is a data storage method that spreads data across multiple hard drives. Its advantages are higher system throughput, greater capacity and better data integrity. It also makes it possible to combine several low-cost devices to achieve greater reliability, speed and capacity. RAID presents many hard drives to the system as a single unit. It is commonly used in servers.
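One way to picture how several drives can appear as a single unit is simple block striping, where consecutive logical blocks are placed on consecutive disks. The sketch below is only an illustration under that assumption. The function name `locate_block` and the round-robin layout are hypothetical choices made for this example, and real RAID implementations use a variety of layouts.

```python
from typing import List, Tuple


def locate_block(logical_block: int, num_disks: int) -> Tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes plain round-robin striping with one block per stripe unit.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    # The quotient is the row (offset on the disk); the remainder picks the disk.
    offset, disk = divmod(logical_block, num_disks)
    return disk, offset


# Six logical blocks spread over three disks.
layout: List[Tuple[int, int]] = [locate_block(b, 3) for b in range(6)]
print(layout)  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

The caller only ever sees logical block numbers, which is what lets the array behave like one large drive.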
The original RAID specification suggested a number of prototype "RAID levels", or combinations of disks. Each had theoretical advantages and disadvantages. Over the years, different implementations of the RAID concept have appeared. Most differ substantially from the original idealized RAID levels, but the numbered names have remained. This can be confusing, since one implementation of RAID 5, for example, can differ substantially from another. RAID 3 and RAID 4 are often confused and even used interchangeably.
History of RAID
Techniques that are now part of various RAID “levels” appeared in a 1978 US patent granted to Norman Ken Ouchi of IBM, titled “System for recovering data stored in failed memory unit”. The patent described techniques such as duplexing and dedicated parity protection, which are part of various RAID implementations today.
However, RAID as a technology was first defined by computer scientists at the University of California, Berkeley in 1987, while they were analyzing the possibility of using more than one device on a computer and making the devices look like a single drive. Then, in 1988, David A. Patterson, Garth A. Gibson and Randy H. Katz published a paper titled “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, which formally defined RAID levels 1 to 5.
Hardware and Software Implementations of RAID
RAID technology can be implemented in two ways: in hardware or in software. The two approaches may also be combined to form a hybrid RAID implementation.
In a software implementation, the disks connected to the system are managed through the usual drive controllers, such as SATA or SCSI. Because CPU speeds are now very high, a software RAID implementation can be faster than a hardware one. A major disadvantage, however, is that the RAID work consumes CPU power that would otherwise be available to applications. A hardware implementation may use a battery-backed write-back cache, which can speed up many applications. In that case, the hardware implementation will prove to be faster, and it is for this reason that hardware RAID is seen as apt for database servers. A software implementation may also refuse to boot after a disk in the array fails completely, until the array is restored.
A solution to the above problem is to keep a preinstalled hard drive ready for use after a disk failure. The implementation, whether hardware or software, switches to this drive as soon as a disk in the array fails. This technique is called a ‘hot spare’.
Implementing RAID in hardware requires a RAID controller, which can be a PCI card or a part of the system's motherboard. The function of the controller is to manage the disks in the array and perform any calculations that may be required. Because of the controllers they use, hardware implementations allow what is called ‘hot swapping’, where a failed disk can be replaced while the system is running.
Modern hybrid RAID implementations do not have a special-purpose controller and instead use the normal controller of the hard drive. The software part of such an implementation can be activated by the user from within the BIOS, and it is operated by the BIOS from then onwards.
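The hot spare technique, together with the kind of calculation a controller may perform, can be sketched in a few lines. This is a toy model and not how any particular controller or driver works. The class `ParityArray`, its method names, and the use of a single dedicated XOR parity disk are assumptions made for illustration only.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class ParityArray:
    """Toy array (hypothetical): data disks, one parity disk, one hot spare."""

    def __init__(self, data_disks: List[bytes]) -> None:
        self.disks: List[Optional[bytes]] = list(data_disks)
        # The last slot holds the XOR parity of all data disks.
        self.disks.append(xor_blocks(data_disks))
        self.spare_available = True

    def fail_disk(self, index: int) -> None:
        """Simulate a complete disk failure, then switch to the spare if any."""
        self.disks[index] = None  # contents of the failed disk are lost
        if self.spare_available:
            self._rebuild_onto_spare(index)

    def _rebuild_onto_spare(self, index: int) -> None:
        """Recompute the lost contents from the surviving disks via XOR."""
        survivors = [d for i, d in enumerate(self.disks) if i != index]
        if any(d is None for d in survivors):
            raise RuntimeError("cannot rebuild: more than one disk is missing")
        self.disks[index] = xor_blocks(survivors)  # spare takes the failed slot
        self.spare_available = False


array = ParityArray([b"AAAA", b"BBBB", b"CCCC"])
array.fail_disk(1)
print(array.disks[1])  # b'BBBB'
```

Once the spare has been consumed, a further failure leaves the array without a replacement disk until someone installs one, which is where hot swapping becomes useful.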