Practical Technology

for practical people.

RAID!

April 1st, 1994

Two well-known laws of computing are that there’s no such thing as enough disk space and no hard drive is fast enough. If you get a gigabyte-sized disk, within a year you’ll have more than a gigabyte of data. You’ll also want megabytes of that data in memory faster than your system can provide it. At least nowadays, you can get disks in gigabyte sizes. Unfortunately, their data-transfer speeds are not much different from those of their 40 MB relatives. A new technology called RAID may change all of that forever.

SLED

One problem with conventional mass storage is that single large expensive disks (SLEDs) are, as the name suggests, pricey. It’s not easy making conventional drives with the tolerance levels that can handle King-Kong sized data loads. This translates into high development and construction costs.

A more important problem with SLEDs is that purely mechanical considerations hinder their output. While CPU and memory speeds continue to improve at a remarkable rate, the raw speeds of secondary storage devices are improving at a much more modest rate.

Most users haven’t noticed that their hot new processors are outracing their disks. That’s because caching makes it possible to hide hard drives’ comparative slowness. Both dynamic RAM (DRAM) and static RAM (SRAM) have gone down in price and up in performance. These trends mean that users have the memory needed to use caching software effectively and vendors can produce reasonably priced caching disk controllers.

Caching disguises the problem, but does nothing to cure it. Higher hard disk data density helps by improving transfer rates, but not enough. SLEDs are still hamstrung by the need to move a drive’s heads mechanically to seek data and the delays caused by disk rotation.

RAID

Fortunately, there is another way of getting gigabyte-sized storage and good performance. This method is to take an array of inexpensive disks and attach them to a computer so that the computer views the array as a single drive. It’s simple, it’s slick, and it works. Called RAID, for redundant arrays of inexpensive disks, it’s going to change the way you buy file-server and workstation storage.

RAID is an old and dusty way of viewing mass storage. As our hunger for more space and higher speeds grows, RAID has become more prominent. Equally important, David Patterson, Garth Gibson, and Randy Katz, in their seminal paper, “A Case for Redundant Arrays of Inexpensive Disks,” gave computer designers a RAID taxonomy. By defining and classifying as levels the ways that an array of disks can be used to improve performance, Patterson and his fellow researchers opened a new vista of mass storage technology.

Pluses

RAID provides several benefits. The first is the best. RAID systems have the potential to deliver vastly increased data transfer rates. In theory, the input/output transmission rate of a RAID system can be more than ten times greater than that of a SLED.

RAID pulls this trick off by “striping” data across the array’s disks. In English, this means that a file can be distributed across the array so that it can be read or written much more quickly than on a SLED. For instance, a file can be placed so that while the first part of it is being read from disk one of the array, the second portion is already being picked up from disk two.

By enabling parallel data transfers, a RAID can multiply data throughput by the number of drives in the array. For example, a four-disk RAID could have four times the throughput of an equal-sized SLED. The resulting increase in bandwidth is largely what gives RAID systems their performance kick. In practice, the same mechanical factors that slow down SLEDs drag the theoretical performance benefits of RAID back to earth. Nevertheless, RAID designs are still inherently faster than SLED designs.
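To make striping a little more concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme, round-robin placement of fixed-size blocks, which is only one of many ways a real controller might lay data out. The names `locate_block` and `stripe` are invented for illustration.

```python
def locate_block(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk).

    Assumes plain round-robin striping; real controllers may differ.
    """
    return logical_block % num_disks, logical_block // num_disks


def stripe(data: bytes, num_disks: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and deal them out across the disks in turn."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), block_size):
        disk, _ = locate_block(start // block_size, num_disks)
        disks[disk].append(data[start:start + block_size])
    return disks


# A 12-byte "file" spread over four disks in 2-byte blocks.
layout = stripe(b"ABCDEFGHIJKL", num_disks=4, block_size=2)
for index, blocks in enumerate(layout):
    print(f"disk {index}: {blocks}")
```

Because consecutive blocks land on different disks, the first four blocks of the file could, in principle, be fetched at the same time.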

Another plus for RAID designs is that the right kind of RAID can handle multiple small read requests. This can vastly increase the effective speed of disks used in network servers. In practice, many considerations can drop the RAID performance edge to a less impressive level. Some RAID levels are not well suited for network operating systems (NOS) or multiuser operating systems.

For example, a network file-server with dozens of users requiring access to data scattered hither and yon across the disks seems tailor made for RAID. And, it is, if it’s the right kind of RAID. A RAID implementation that’s meant for large sequential data reads and writes simply won’t cut it on a file-server. Such a RAID controller might work well on a dedicated database engine, but there’s little else in the microcomputer world where such designs can play a role. You must be certain that any particular RAID design fits your needs.

Another concern that limits RAID’s power boost is that operating systems like MS-DOS, OS/2 and most flavors of Unix require every block of a file to be on one drive. This cuts out RAID’s ability to improve throughput by accessing multiple drives concurrently.

The moral of the story is that the operating system can determine how effective a RAID really is. For the maximum in RAID benefit, the drives should be coupled with an operating system like Novell Netware or some types of Unix that can distribute a file’s blocks across the entire array rather than one disk. Another alternative is a RAID controller that can fool the operating system into thinking that the RAIDed disks are one remarkably large disk.

The second RAID advantage is that these drives should prove cheaper than their SLED equivalents. Note that I say, ‘should.’ RAID technology is just emerging from the starting gate, and to date, its prices are not low. At this point, we’re still paying for RAID research and development.

Minuses

It’s real easy to see RAID’s Achilles’ heel. RAID gets its performance and megabyte for the dollar bang from putting multiple cheap disks into a single logical array. Now, to find the Mean Time to Failure (MTTF) for that array, simply take the MTTF of one disk and divide it by the total number of disks in the array. For instance, many hard drives have an MTTF of 20,000 hours. That’s not bad. Now, in a RAID with 10 of these drives, the MTTF drops to an appalling 2,000 hours. In other words, you can be reasonably sure that the RAID will fail within a year of normal business use. Ouch.
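The arithmetic is simple enough to check in a few lines of Python. The 40-hour business week used to turn hours into weeks is my assumption, not a figure from any drive maker, and `array_mttf` is just an illustrative name.

```python
def array_mttf(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an unprotected array: one disk's MTTF divided by the disk count."""
    return disk_mttf_hours / num_disks


mttf_hours = array_mttf(20_000, 10)

# Assumption: "normal business use" means roughly a 40-hour week.
HOURS_PER_BUSINESS_WEEK = 40
weeks_to_failure = mttf_hours / HOURS_PER_BUSINESS_WEEK

print(f"Array MTTF: {mttf_hours:.0f} hours, about {weeks_to_failure:.0f} business weeks")
```

Under that assumption, 2,000 hours works out to about 50 business weeks, which is where the "within a year" estimate comes from.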

Luckily, there is a way of getting around that painful MTTF: fault tolerance. This key concept unlocks RAID’s potential. Fault tolerance’s importance to RAID can be judged by the fact that it’s what Patterson used to define RAID’s levels.

Levels of RAID

The first, or zero, level of RAID doesn’t use fault-tolerance at all. The zero level relies upon high MTTFs on each drive to protect the RAID from disaster. Systems built around this idea tend to be faster than a bat out of hell. Cynics might say that they’re about as reliable.

RAID 0 drives, if carefully designed, can work quite well. The secret is to have only a few drives in the array and to make certain that these drives are highly reliable. This brings the array’s MTTF to industry-acceptable levels.

MicroNet Technology takes this approach with its Macintosh-specific Raven disk array storage system. The Raven SBT-1288NPR uses a pair of Seagate WrenRunner 2 644 MB drives and two of MicroNet’s NuPORT SCSI-2 host adapters.

The result is one of the fastest drives ever for a Macintosh. The WrenRunner 2 drives run at 5,400 RPM, about 50 percent faster than conventional drives. Combine this with the RAID advantage and MicroNet’s ‘Overlapping Seek’ data search algorithms and you get a gigabyte plus of storage with access times that dip as low as 6 milliseconds. Even more impressive, the SBT-1288NPR can sustain data transfers of 4.4 MB per second. Can you say “Zoom?”

High-end workstations can make the best use of Level 0 RAIDs. Level 0 sub-systems would not work well on file-servers where the limited number of disks in an array would limit the speed benefits obtainable from multiple read requests.

First Level

Level 1 RAIDs rely upon that old stand-by of fault tolerance, disk mirroring, to protect their data. Level 1 RAIDs are safe, sure, and easy to make. There are probably more Level 1 RAID designs now in production than any other. Disk mirroring, however, means that only half a RAID’s maximum disk space can be used for data storage. That’s too high a price for safety for many users.

Missing disk space is the most clear-cut problem with disk mirroring, but it’s not the only one. Disk mirroring also slows I/O because of the need to read from and write to two disks. In uni-processor systems with unintelligent controllers, the I/O performance drop can be as bad as 50 percent. Multiprocessors and controllers with onboard processors go a long way to removing this performance sting.

Are you willing to pay the price of half your disk’s room for RAID’s performance benefits? RAID vendors are betting that you are.

There are two reasons why these companies are making this bet. First, disk mirroring is cheap from a production point of view. Any controller that can handle multiple active drives can be put into service as a RAID 1 controller with the proper software driver. This keeps out ST-506 and ESDI controllers, which can handle only one active drive at a time, but SCSI controllers have little trouble being retrofitted into RAID Level 1 controllers.

The other reason is that RAID Level 1 systems are not really competing with SLEDs. Users currently using disk mirroring are the customers for Level 1. From this perspective, Level 1 makes a great deal of sense. Level 1 RAIDs are faster than pure disk mirroring systems, but provide the same safety net.

Network administrators probably will find RAID 1 an attractive alternative. It’s inexpensive and provides the benefits of disk mirroring with less of a performance hit.

Novell Netware 386, the most popular NOS, supports Level 1 RAID. Data General’s DG/UX operating system for their AViiON workstations also enables Level 1 RAID. Other workstation vendors that are positioning their machines as network servers will be adding RAID 1 to their offerings. By the time this sees print, there is no doubt that other vendors will be producing RAID 1 software.

Second Level

The least interesting RAID Level for microcomputer users is Level 2. At this level, data is safeguarded by bit-interleaving data across the entire disk array with Hamming error-correction codes. This takes up less room than disk mirroring, but that’s about its only virtue for small computer users.

Several disks must be assigned as check disks to store error correction codes. Level 2 eats up about 40% of available disk space. Level 2 also requires the controller or CPU to be constantly generating error-correction codes. Worse yet from a performance standpoint, every disk in the array must be accessed for a single data read or write.

All of this makes Level 2’s small-file data transfer rate, in a word, awful. An unadorned Level 2 array simply isn’t suitable for PC or file-server use. No one makes Level 2 RAIDs for PCs; if anyone did, no one should buy them.

Third Level

Another of Level 2’s problems is that its check disks are redundant. It’s a simple enough task to enable a disk controller to tell when a specific drive in an array has failed. Even detecting sector failures isn’t that much trouble. Level 3 uses the idea that information on a failed disk or sector can be restored with a single check disk.

Level 3 guards against data loss by parity checking. Level 3 error-correction works by calculating a parity value for each byte. In parity checking, an extra bit holds the parity value for each byte. Systems that use ‘even’ parity checking set the parity bit to ‘1’ if the byte contains an odd number of ‘1’ bits, so that the total count of ‘1’ bits, parity bit included, is always even. If the byte already contains an even number of ‘1’ bits, then the parity bit is ‘0.’
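For readers who like to see it spelled out, here is a small Python sketch of an even-parity bit. It assumes the usual convention, in which the parity bit is chosen so that the total number of ‘1’ bits, parity bit included, comes out even. The function name `even_parity_bit` is my own.

```python
def even_parity_bit(byte: int) -> int:
    """Return the bit that makes the total count of 1 bits (byte plus parity) even."""
    ones = bin(byte & 0xFF).count("1")
    return ones % 2  # 1 when the byte holds an odd number of 1 bits, else 0


for value in (0b10110010, 0b10110011):
    print(f"{value:08b} -> parity bit {even_parity_bit(value)}")
```

The first byte has four ‘1’ bits, so nothing needs to be added; the second has five, so the parity bit is set to even things up.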

How does this work to restore data if a disk in a RAID bites the big one? For each bit position, the parity of the bits on the intact data disks is recalculated. This is compared to the parity recorded on the check disk before the failure. If the parities are not the same, then the lost bit was a ‘1’; otherwise the missing bit was a ‘0.’
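In code, that comparison boils down to an exclusive-OR. The following Python sketch assumes XOR (even) parity computed across the disks, with each "disk" reduced to a short byte string. The helpers `parity_of` and `rebuild` are hypothetical names, not anybody's controller firmware.

```python
from functools import reduce
from operator import xor


def parity_of(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild(surviving_blocks: list[bytes], check_block: bytes) -> bytes:
    """Recover the one missing block from the survivors plus the check disk."""
    return parity_of(surviving_blocks + [check_block])


data_disks = [b"RAID", b"disk", b"byte"]
check_disk = parity_of(data_disks)

# Pretend the second disk has died and rebuild it from what is left.
recovered = rebuild([data_disks[0], data_disks[2]], check_disk)
print(recovered)
```

Wherever the surviving bits and the stored parity disagree, the XOR yields a ‘1’; where they agree, it yields a ‘0’. That is the same rule described above, applied to every bit at once.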

Besides being a neat way of restoring data, this means that up to 85% of the array’s space can be available for storage. Level 3 gives more storage room to users than either Levels 2 or 1.
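A quick back-of-the-envelope check in Python shows where figures like these come from. The array sizes here are my own examples, not any vendor's configuration, and `usable_fraction` is an illustrative name.

```python
def usable_fraction(total_disks: int, check_disks: int) -> float:
    """Share of the array's raw capacity left over for data."""
    return (total_disks - check_disks) / total_disks


# Mirroring (Level 1): every data disk needs a twin.
print(f"Mirrored pair:          {usable_fraction(2, 1):.0%}")

# Single check disk (Level 3), assuming a seven-disk array.
print(f"Seven disks, one check: {usable_fraction(7, 1):.1%}")
```

With one check disk, the usable share grows as disks are added, so the exact percentage depends on how many drives the array holds.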

That’s the good news. The bad news is that Level 3 has some of Level 2’s I/O woes. Unlike Level 2, reads can be made at high speed. Writes are another story. Every time data is written to a disk, either the CPU or a controller processor must generate a new parity value.

This really puts a load on the processor. Even a 50 MHz 486 would show signs of overwork in a transaction heavy environment. In practical terms, Level 3 should not be implemented in software destined for any PC’s main processor.

The need to write the parity values to the check disk also slows Level 3 designs. If that weren’t enough, Level 3 can only perform a single I/O transaction at a time. Level 3 works fine for large data block transfers. Like Level 2, though, it’s really not well suited for LAN, multiuser, or workstation use.

Fourth Level

The primary difference between Level 3 and Level 4 is the level of data interleave and parity checking. In Level 4, data is interleaved between disks by sector instead of by bits.

The results are faster data reads because several reads can be conducted at once if the reads aren’t to the same disk. Write speeds are still hampered because the parity drive must be updated every time there’s a write. Overall effective performance is dramatically better than RAID Levels 1 through 3, though. That’s because reads make up the vast majority of primary storage interactions.

Level 4 small data transfer I/O also gets a kick in the pants because the parity calculation is simpler. Level 3’s parity calculations are not difficult, but they are processor killers because every disk in the array must be consulted. Level 4 sidesteps this. In Level 4, only the values of the old data, the new data, and the old parity are used to calculate parity. Write operations take up far less time. Unfortunately, only a single write can be done at a time.
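The shortcut can be sketched in Python, assuming XOR parity. The function name `updated_parity` and the sample sector contents are invented for the example.

```python
from functools import reduce
from operator import xor


def full_parity(sectors: list[bytes]) -> bytes:
    """Parity the slow way: consult every data sector in the stripe."""
    return bytes(reduce(xor, column) for column in zip(*sectors))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """Parity the quick way: only old data, new data, and old parity are needed."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


sectors = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = full_parity(sectors)

# Overwrite the middle sector without touching the other data disks.
new_sector = b"\x00\xff"
quick = updated_parity(sectors[1], new_sector, parity)
slow = full_parity([sectors[0], new_sector, sectors[2]])
print(quick == slow)
```

Both routes arrive at the same parity, but the quick one involves two disks (the target and the parity drive) no matter how wide the array is.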

The right combination of cache and intelligent controller can overcome this slowdown. Dell Computers takes this approach with their Dell Drive Array (DDA).

DDA

The DDA starts with a high-performance 32-bit EISA disk controller. This controller uses a dedicated 16 MHz Intel i960 RISC microprocessor to generate parity values. The processor controls both data access and layout. In turn, the i960 gets its marching orders from instructions stored in 512K of 32-bit firmware ROM. These instructions are supplemented by optional dynamically loaded firmware that can be loaded in a 256K Static RAM (SRAM) storage area. This means that when Dell improves the RAID’s code, the new and improved firmware can be loaded as software. Using DDA means never having to say you’re sorry that you’re locked into obsolete firmware. The SRAM also can be used as a cache.

DDA uses an Intel 82355 bus master interface chip to connect with the EISA bus. This combination can support a burst transfer rate of up to 33 MB per second. In real world applications, the DDA can sustain up to 5 MB per second transfer rates.

The DDA can handle up to ten 200 MB integrated drive electronics (IDE) drives for a total capacity of 2 GB. To work, the DDA must have at least 2 drives. The drives themselves have an average access time of 16 milliseconds. The speed of the DDA itself depends on its configuration.

The DDA can be set up to support simultaneous reads. In this mode, up to five concurrent unrelated data reads can occur at once. While ideal for network servers, this comes at the cost of fault tolerance. While in simultaneous seek mode, the protection of data redundancy is unavailable.

In DDA’s other mode, data striping works with Level 4 data guarding. In this setup, the DDA gains the bandwidth advantages of being able to read data from logically concurrent sectors across the width of the array.

Either mode makes Dell’s disks faster than their SLED counterparts. Your system requirements will determine which setup will work best for you. Workstation users will clearly be better off with full Level 4 protection. Network administrators will have a much harder time deciding which mode to use.

Compatibility shouldn’t be a problem for anyone. From an operating system point of view, the DDA looks like the popular Adaptec 1540 SCSI controller. In addition, DDA directly supports MS-DOS, OS/2, Unix and Novell Netware.

Fifth Level

At Level 5, the parity disk bottleneck is broken. Parity information is stored directly on the data disks. This means that up to 85 percent of the disk can be used for data without the I/O hassles of Level 3. Even more important, Level 5 supports multiple simultaneous reads and writes.
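One way to picture parity living on the data disks is a layout where the parity block shifts to a different disk on each stripe. The rotation below is an assumption chosen for illustration, not a description of any shipping controller, and `parity_disk` and `stripe_layout` are hypothetical names.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Pick the disk holding parity for a stripe (assumed rotation, last disk first)."""
    return (num_disks - 1 - stripe) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list[str]:
    """Label each disk's block in a stripe as data ('D') or parity ('P')."""
    holder = parity_disk(stripe, num_disks)
    return ["P" if disk == holder else "D" for disk in range(num_disks)]


for stripe in range(5):
    print(f"stripe {stripe}: {' '.join(stripe_layout(stripe, 4))}")
```

Since no single drive holds all the parity, two writes to different stripes need not queue up behind the same disk.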

The fifth level of RAID promises the most, but it’s also the hardest to create. A dedicated processor on the controller is a must for Level 5. The processor must not only make and track parity check bytes; it must also be faster than greased lightning to handle the I/O demands.

There are three ways to put Level 5 to work. In the first, the existing data and parity are read and then a transient parity value is generated by removing the old data from the equation. This transient parity is then used with the new data to create the new parity value.

The second method combines the data that will not be changed by the write transaction with the new data to create a new parity value. Afterwards, the new data and parity are written to disk.

The final way of obtaining parity values in Level 5 is not to bother reading existing data or parity values. Instead, the controller waits for two new bytes to be written and then creates the parity value from the incoming information. The advantage to this is that the controller doesn’t need to waste time reading from the disk every time a write request comes in.
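The three approaches can be compared side by side in a short Python sketch. It assumes XOR parity and a three-block stripe. For the third method, I am reading "waits for new bytes" as waiting until enough incoming data has arrived to compute parity without any reads, which is my interpretation. The helper `xor_blocks` is a made-up name.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


stripe = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
old_parity = xor_blocks(*stripe)
new_block = b"\xaa\xbb"  # will replace stripe[1]

# Method 1: read old data and parity, strip the old data out, fold the new data in.
transient = xor_blocks(old_parity, stripe[1])
parity_1 = xor_blocks(transient, new_block)

# Method 2: combine the blocks that are not changing with the new data.
parity_2 = xor_blocks(stripe[0], stripe[2], new_block)

# Method 3: no reads at all; parity comes purely from incoming data
# (assumed here to be a complete new stripe).
incoming = [b"\x01\x02", b"\x03\x04", b"\x05\x06"]
parity_3 = xor_blocks(*incoming)

print(parity_1 == parity_2, parity_3)
```

The first two methods always agree on the result; they differ only in which disks have to be read to get there. The third skips the reads entirely, at the price of waiting for enough new data to show up.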

Micropolis, a well-known hard disk manufacturer, has been a leader in bringing RAID 5 to the marketplace. At this time, they are not shipping a RAID 5 product. There will soon be, however, a hardware implementation of RAID 5 for their Model 2112 1.085 GB drives.

Future RAID

Make no mistake about it, RAID systems are coming. With the arrival of processors like the i960, it’s now possible to make controllers with the necessary smarts to deal with RAID’s processor demands. With that technical barrier out of the way, RAID controllers will enter the marketplace in increasing numbers as design problems are ironed out.

At this time, RAIDs are too expensive for any but the most demanding LAN or workstation users. The technology’s price will drop. As this happens, RAID designs’ speed and safety features will make them the mass storage systems of choice for the rest of the 1990s.

A version of this story was first published in Byte.
