Creating a RAID array out of cheap USB disks

A RAID array is a group of disks working together to provide redundant storage. Hard drives are mechanical devices with moving parts, which makes them prone to failure. Imagine what it means to spin the hard drive platters at 7200 rpm for years without a pause, as in a typical server setup. Eventually, all disks die. One disk may run for a decade without a hitch, while another disk of the same type may die after a year. The problem is, there is no reliable way to predict when a particular disk will die.

To fight this problem, a number of schemes have been devised. They collectively go by the name of RAID, or Redundant Array of Inexpensive Disks. The common denominator is that they all use one or more extra disks to hold redundant data, so that the failure of one disk will not interrupt service. An exception to this is RAID level 0, also known as striping, which really is no RAID at all, because it does not provide any kind of redundancy. Compared to using the disks separately, striping will in fact increase the probability of a fatal disk crash: if one drive fails, the data on all drives will be lost. Striping is therefore useful only for temporary data storage.

A failed disk should be replaced as soon as possible, because in a typical situation the array cannot survive another disk failure before a new disk has been added and the array rebuilt (with the notable exception of RAID 6). The good news is that all of this can be done without taking the data offline.

In this article I am going to show you how to create a RAID 5 array out of ordinary USB hard disk drives. Later on, I will show you how to extend the array with another disk to add more space. The good thing about these drives is that they are cheap and hot-pluggable. The result will obviously not be as fast as an array built from SATA drives, but it serves well for storing your music, videos, backups and so on. Running “hdparm -t” on my RAID array made of SATA drives gives a reading of about 68 MB/sec, whereas the USB array averages about 42 MB/sec. For many applications this is quite enough.

By the way, the commands in this article apply to other kinds of drives as well. In fact, the USB hard disks show up as regular SCSI or SATA disks. You can create a RAID array from all kinds of block devices, and even mix them. You could, for example, create a RAID 1 out of two SATA disks, and then add one USB disk to the array as a spare.

Setting up the array

That’s enough theory; now it is time to get our hands dirty. In the beginning, my /proc/partitions had lines like this (the device names and sizes below are illustrative; yours will differ):
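
       8     0  488386584 sda
       8    16  488386584 sdb
       8    32  488386584 sdc
       8    48  488386584 sdd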

Those were the four USB hard drives I wanted to use for my array.

Before creating the array, the individual disks must be prepared. First, you must create a single primary partition on each disk, and set the partition type to 0xFD – “Linux RAID autodetect”. This way both you and the kernel will easily see that the partition belongs to a RAID array.
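
With fdisk, the session goes roughly like this (repeat for each disk; /dev/sda stands in for your actual device):

    fdisk /dev/sda
    # inside fdisk:
    #   n   create a new primary partition spanning the whole disk
    #   t   change the partition type; enter "fd" (Linux RAID autodetect)
    #   w   write the partition table and exit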

Next, the command to create an array out of three of the disks looks like this (substitute your own device names):
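
    # the fourth disk (sdd) is intentionally left out for now
    mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1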

You should now have a device called /dev/md0. That is the device you will be using from now on. To see the details of your newly created array, run:
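
    mdadm --detail /dev/md0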

The command will show you some basic information about your array, as well as status information for each of the disks. A RAID 5 array keeps working as long as no more than one drive has failed. When a new, working drive is inserted into the array, it takes some time to copy the relevant data onto the new drive. This is shown as the “rebuild status” in the detailed listing. For example, my 500 GB USB hard drives take several hours to rebuild.

You can also get some interesting information by looking at the file /proc/mdstat. That file comes straight from the kernel, and contains some very basic info. For a rebuilding array, it includes an estimate of the time left. Your new md device should also show up in /proc/partitions now.
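
For example, a rebuilding three-disk array might look something like this (all numbers here are illustrative):

    cat /proc/mdstat
    Personalities : [raid5]
    md0 : active raid5 sdc1[3] sdb1[1] sda1[0]
          976766976 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
          [===>.................]  recovery = 16.1% (78731520/488383488) finish=312.4min speed=21860K/sec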

To take advantage of our redundant array, a filesystem must be created on it. You can no longer use the /dev/sdX devices directly, as that would corrupt the array. You must use the device /dev/md0 instead.

Logical Volume Management

If you want, you can create a filesystem directly on your md device, but I would recommend using LVM2 instead. LVM stands for Logical Volume Manager, and it is quite a powerful tool for managing disks. To cut a long story short, it can be used to create, resize and move logical volumes very flexibly (online in almost every case), in addition to a number of other things.

Make sure you have the LVM2 userland tools installed. For Ubuntu, “apt-get install lvm2”.

First thing to do is create a physical volume out of our new md device:
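
    pvcreate /dev/md0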

Step number two is to create a volume group which will act as a container for our logical volumes:
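
    vgcreate myvg /dev/md0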

myvg is just a name for our new volume group. You can use whatever name you want. It is used for distinguishing between volume groups, as there can be more than one in a given system.

A volume group is just an abstraction used to group one or more physical volumes into a single space which can be divided into logical volumes. A logical volume looks just like a disk partition to the kernel, and so a filesystem can be created on it.

Let’s first take a look at our volume group. Running the command vgs will give you a quick list of your volume groups (the output below is illustrative):
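
    vgs
      VG   #PV #LV #SN Attr   VSize   VFree
      myvg   1   0   0 wz--n- 931.52G 931.52G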

As you can see, there is now about 930 GB of free space in the volume group. Next, let’s create a 100 GB logical volume:
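
    lvcreate -L 100G -n mylv myvg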

The command lvs will give a list of the logical volumes in the system (again, illustrative output):
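
    lvs
      LV   VG   Attr   LSize
      mylv myvg -wi-a- 100.00G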

The logical volume will now show up as the device /dev/myvg/mylv. To create a file system on it:
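
    # ext3 is used here; pick the filesystem you prefer
    mkfs.ext3 /dev/myvg/mylv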

You can now mount it anywhere you want. Let’s create a mount point in /myfs and mount the new logical volume there:
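
    mkdir /myfs
    mount /dev/myvg/mylv /myfs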

Add a line to /etc/fstab to mount it automatically at each boot.
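
For example (assuming the ext3 filesystem created above; adjust to taste):

    /dev/myvg/mylv  /myfs  ext3  defaults  0  2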

By the way, in addition to the commands pvs, vgs and lvs, you can get more detailed information about the physical and logical volumes, and volume groups, with the commands pvdisplay, vgdisplay and lvdisplay.

Extending your array

The kernel’s RAID support now includes the ability to extend a RAID 5 array. What that means is that if you are running out of space, you can just plug in a new USB disk, extend the array, and then extend your filesystem. All of this can be done with the filesystem online, and without ever rebooting your machine. Here’s how.

As you probably noticed, I intentionally left the fourth disk out of the array when first creating it. We will now add it.

The growing process consists of two commands. The first command will add the new disk to the array, making it a spare drive, and the second one will grow the array to the new size:
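
    mdadm /dev/md0 --add /dev/sdd1          # the new disk becomes a spare at first
    mdadm --grow /dev/md0 --raid-devices=4  # reshape the array to use all four disks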

You can now see with “mdadm -D /dev/md0” or “cat /proc/mdstat” that the array is reshaping. The reshape process will likely take several hours, and the new space will not become available before it completes (output below is again illustrative):
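
    cat /proc/mdstat
    Personalities : [raid5]
    md0 : active raid5 sdd1[3] sdc1[2] sdb1[1] sda1[0]
          976766976 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
          [==>..................]  reshape = 12.5% (61048064/488383488) finish=654.7min speed=10880K/sec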

After the reshape is complete, you can add the new free space to the physical volume by running:
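
    pvresize /dev/md0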

By default, pvresize grows the physical volume to fill all the available space on the device, which is just what we want. vgs will now show that there is more free space in myvg:
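
    vgs
      VG   #PV #LV #SN Attr   VSize VFree
      myvg   1   1   0 wz--n- 1.36T 1.27T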

To add a terabyte to our logical volume:
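
    lvextend -L +1T /dev/myvg/mylv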

The last step is to grow the filesystem. A tool called ext2online can be used to extend an ext2/ext3 filesystem without unmounting it. Install it with “apt-get install ext2resize”. Run the command like this:
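
    ext2online /dev/myvg/mylv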

And that’s it. You have now added more free space to your filesystem without disrupting operation.

Detecting problems and managing failed disks

The mdadm tool includes a monitor mode which can be used to monitor the array, detect problems and let you know about them. On an Ubuntu system, a configuration file exists in /etc/mdadm/mdadm.conf. You can put your e-mail address on the MAILADDR line to receive e-mail when mdadm --monitor detects a problem. Just make sure your system can send e-mail properly.
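
For example (the address below is a placeholder):

    # /etc/mdadm/mdadm.conf
    MAILADDR you@example.com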

If a disk dies and you have set up a spare disk, the kernel will automatically rebuild the array using the spare. Otherwise the array will remain usable in a degraded state, but it will not survive another disk failure until a replacement has been added and the array rebuilt.

You can remove the failed disk from the array with:
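
    mdadm /dev/md0 --remove /dev/sdX1   # sdX1 is the failed member's partition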

To add another disk to the array, use:
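
    mdadm /dev/md0 --add /dev/sdX1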

If the array is missing a disk, the new disk will automatically be used for rebuilding it. If the array is complete, the disk will be added as a spare.

If you intentionally want to replace a drive with another one in a working array, run:
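
    # the operations are performed left to right
    mdadm /dev/md0 --add /dev/sdX1 --fail /dev/sdY1 --remove /dev/sdY1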

This will add disk /dev/sdX to the array and remove /dev/sdY from it. Disk sdY must be marked as failed before it can be removed.

One thought on “Creating a RAID array out of cheap USB disks”

  1. To comment on my own post: I’ve sometimes encountered a situation where two disks drop out of an array simultaneously, and the array refuses to start up. After a reboot, using mdadm --examine /dev/sdX, I can read all the disks’ RAID superblocks fine, so none of the disks is actually broken. Still the array will not start.

    I have managed to re-create the array like this:

    mdadm --stop /dev/md0
    mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

    A question will appear:

    mdadm /dev/sda1 appears to be part of an array:
    level=raid5 devices=4 ctime=Mon Aug 23 16:36:52 2010
    mdadm /dev/sdb1 appears to be part of an array:
    level=raid5 devices=4 ctime=Mon Aug 23 16:36:52 2010
    mdadm /dev/sdc1 appears to be part of an array:
    level=raid5 devices=4 ctime=Mon Aug 23 16:36:52 2010
    mdadm /dev/sdd1 appears to be part of an array:
    level=raid5 devices=4 ctime=Mon Aug 23 16:36:52 2010
    Continue creating array?

    After answering yes, the array started, and everything worked fine after a second reboot.
