Software RAID5 and LVM with the Etch Installer

Posted by ElizabethBevilacqua on Thu 22 Mar 2007 at 09:18

Our team at LinuxForce recently put together a Debian server with LVM on a software RAID5 volume. This has been possible through complex installation procedures in the past, but today the Debian Etch installer is capable of handling such an installation if you follow the proper steps, which I outline in this article.

Among other things, we needed the flexibility to write partition tables for Xen on the fly, the dependability of a generous replacement window when hard drives fail, and as little risk of data loss and downtime from drive failure as possible.

Assumptions about our example for this article

1. Our partition table will be as follows:

  • 1G / - root
  • 1G /tmp - tmp
  • 3G /home - home
  • 3G /var - var
  • 500M swap

2. Our system has four drives (for RAID: three active drives, one hot spare)

3. The hard drives are SATA or SCSI; if you're using IDE drives, keep in mind that references such as sda1 will be hda1 instead.

4. Since this article is being written so close to the release of Etch as stable, there shouldn't be too many changes to these directions before release, but for reference we are using the Debian Etch RC1 installer on a netinst daily image downloaded March 13, 2007.

Before Installation

The first challenge was developing a partitioning scheme that would boot. LILO and GRUB cannot boot from RAID5, so an additional partition outside the RAID5 had to be created for this purpose. There are many creative ways of doing this; our solution was to create a 1 GB RAID1 root partition. Alternatively, we could have created a smaller separate /boot partition.

Another consideration is where to put the swap partition. There is an argument for putting swap on its own partition outside of the LVM (mostly for speed). We decided to keep it on LVM for ease of administration.
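
For reference, swap on LVM ends up as just another logical volume. The installer handles this for you, but after installation it amounts to something like the following sketch (the volume and group names assume the example used later in this article):

mkswap /dev/servername/servername_swp
swapon /dev/servername/servername_swp

# corresponding /etc/fstab entry:
# /dev/servername/servername_swp  none  swap  sw  0  0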

Make sure any hardware RAID is turned off in the BIOS; we want all disks to appear separately in the partitioner.

Creating RAID Volumes

Start the Debian Etch netinst as with a normal install.

At the Partition Disks menu choose Partitioning method: Manual

Delete any existing partitions so only "FREE SPACE" is listed.

Select the first drive and do the following:

  • Create new partition
  • New partition size: 1G (or the size you want root to be)
  • Type for the new partition: Primary
  • Location for the new partition: Beginning
  • Use as: physical volume for RAID
  • Done setting up the partition

This partition will hold your bootable RAID1.

Then:

  • Create new partition
  • New partition size: <the remaining space on the drive>
  • Type for the new partition: Primary
  • Location for the new partition: Beginning
  • Use as: physical volume for RAID
  • Done setting up the partition

This partition will hold your RAID5.

Partition the remaining three disks in the same way.

Each partition on the drives will now show up in the partition overview as "K Raid".
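
As an aside, if you ever need to replicate this partition layout outside the installer (for example when replacing a failed disk later), sfdisk can copy an MBR partition table from one disk to the others. A sketch; double-check the device names before running anything like this:

sfdisk -d /dev/sda | sfdisk /dev/sdb
sfdisk -d /dev/sda | sfdisk /dev/sdc
sfdisk -d /dev/sda | sfdisk /dev/sdd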

Configure Software RAID

There is now an option to Configure Software RAID on your partitioning screen - select this.

At the next screen say "Yes" to write the changes to the storage devices and configure RAID.

Choose: Create MD device

  • Multidisk device type: RAID1
  • Number of active devices for the RAID1 array: 3
  • Number of spare devices for the RAID1 array: 1
  • Now choose the devices to use. Since you created the 1G partitions first, you will want sda1, sdb1 and sdc1.
  • The next screen will ask which device to use as the spare; choose sdd1.

Now you will want to choose Create MD device again.

  • Multidisk device type: RAID5
  • Number of active devices for the RAID5 array: 3
  • Number of spare devices for the RAID5 array: 1
  • Now choose the devices to use. The first three (sda2, sdb2 and sdc2) should be the ones you wish to use as active.
  • The next screen will ask which device to use as the spare; select the only remaining option, which should be sdd2.

Now: Finish

You will be sent back to the partitioning screen.
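
For reference, what the installer builds here is roughly equivalent to the following mdadm commands (a sketch only - the installer runs its own tooling; with mdadm --create, the device listed beyond the active count becomes the hot spare):

# three-way RAID1 over the 1G partitions, with one hot spare
mdadm --create /dev/md0 --level=1 --raid-devices=3 --spare-devices=1 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# RAID5 over the large partitions, again with one hot spare
mdadm --create /dev/md1 --level=5 --raid-devices=3 --spare-devices=1 \
    /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2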

Partition RAID1

Select the partition on your newly created RAID1 volume. It will say "Use as: do not use"; select this line, hit enter and make it an ext3 / (root) partition.

Done setting up this partition

Create Physical LVM Volume

Select the partition on your newly created RAID5 volume. It will say "Use as: do not use"; select this line, hit enter and make it a "physical volume for LVM".

Done setting up the partition

Once this is complete, the RAID5 volume will show up in the partition overview as "K LVM".

Configure LVM

There is now an option to Configure the Logical Volume Manager on your partitioning screen - select this.

At the next screen say "Yes" to write the changes to the storage devices and configure LVM.

In the LVM Configuration screen you will first need to Create Volume Group:

  • We give this group the same name as the server itself.
  • Choose the device - there should only be one choice: /dev/md1

Now Create Logical Volumes:

  • Select the volume group you just created (there should only be one option)
  • Create your logical volumes, one for each partition being created, and name them servername_var, servername_tmp, servername_swp and servername_home (note: you can name these whatever you want)
  • The size of each logical volume is the size you want for the corresponding partition: /var, /tmp, swap and /home

Once you are finished and arrive back at the LVM configuration screen, choose Finish.
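
Behind the scenes this corresponds roughly to the following LVM commands (a sketch; the group name "servername" and the sizes come from the example partition table above):

pvcreate /dev/md1
vgcreate servername /dev/md1
lvcreate -L 3G   -n servername_var  servername
lvcreate -L 1G   -n servername_tmp  servername
lvcreate -L 500M -n servername_swp  servername
lvcreate -L 3G   -n servername_home servername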

Partition LVM Volumes

Back at the partitioning screen you will see all the logical volumes in the partition table.

Partition each of these as you normally would (when you go into each partition it will say "Use as: do not use"; select this line and hit enter to change it). LV servername_var will be the entire /var partition, and so on.

When partitioning is complete: Finish partitioning and write changes to disk.

Continue Debian installation as with a normal install.

Bootloader

When you get to the step where you need to install GRUB or LILO, you want it installed for your RAID1 device, md0. The Debian installer should figure this out on its own and you can accept the default, but keep this in mind if any problems arise when you complete the installation and reboot the machine.
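
Should you ever need to (re)install GRUB onto the MBR of each RAID1 member by hand from the running system, something like the following sketch will do it (adjust the device names to your setup; see also the grub-shell method in the comments below):

grub-install /dev/sda
grub-install /dev/sdb
grub-install /dev/sdc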

Notes

Optional step:

To make sure LVM doesn't get "confused" by the separate disks versus the RAID volume, we tell LVM to scan only the md1 block device.

Edit /etc/lvm/lvm.conf and change the filter line (make sure you only have one filter line) to:

filter = [ "a|/dev/md1|", "r/.*/" ]
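
Once the system is up you can check that the filter behaves as expected; only /dev/md1 should be reported as a physical volume. A quick sanity check:

pvscan
vgscan
lvscan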

We leave this as an optional step because you may have reasons for LVM to look at other block devices.

Written by: Elizabeth Bevilacqua, System Administrator at LinuxForce

Acknowledgements:

Stephen Gran, System Administrator at LinuxForce

CJ Fearnley, President of LinuxForce

References:

http://dev.jerryweb.org/raid/

http://ads.wars-nicht.de/blog/archives/54-Install-Debian-Etch-on-a-Software-Raid-1-with-S-ATA-disks.html

Posted by TRS-80 (130.95.xx.xx) on Thu 22 Mar 2007 at 12:06
Instead of running RAID 5 with a hot spare, you should consider running RAID 6. With the increasing size of hard disks, resyncing can take long enough that the chance of a second disk failure is non-trivial. And to quote the md(4) manpage "The performance for RAID6 is slightly lower but comparable to RAID5 in normal mode and single disk failure mode."
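
For the four-disk example in the article, building that array by hand would look something like this sketch:

mdadm --create /dev/md1 --level=6 --raid-devices=4 \
    /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2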

[ Parent | Reply to this comment ]

Posted by cef (59.167.xx.xx) on Thu 22 Mar 2007 at 14:47
I've had a few people comment to me (in person) that RAID 6 seems to have a few bugs still, not with the actual RAID code, but with assembly/maintenance of the array. Things like not being able to assemble an array if one of the disks in the array is faulty, etc, which is a real show-stopper in keeping things going once they are up. That said, I agree that RAID 6 would be a better solution.

I would also suggest that you stagger your spares for the RAID set on different drives.

eg: For the RAID 1 use sda1, sdb1, sdd1 for active and sdc1 for the spare, while for the RAID 5 use sda2, sdb2 and sdc2 for active and sdd2 as the spare.
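
In mdadm terms that staggering would look something like the following sketch (with --create, the devices listed beyond the active count become the spares):

mdadm --create /dev/md0 --level=1 --raid-devices=3 --spare-devices=1 \
    /dev/sda1 /dev/sdb1 /dev/sdd1 /dev/sdc1
mdadm --create /dev/md1 --level=5 --raid-devices=3 --spare-devices=1 \
    /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2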

The reasoning behind this is that if the system gets any decent use, then all disks will be in use regardless, and you won't get "spare" hardware failure on use. As the disk sits idle you really can't be sure that when it's needed to be called into use, it WILL actually be there. This way, as all disks are in use in some way, you'll KNOW when a spare fails (as one of the other active ones will die), and you can replace appropriately. It's no use having a spare if the damn thing fails during a rebuild, which is also around when other drives have a higher chance of failing (dying before, during or after the sustained load/activity of a rebuild).

It's probably also good to rotate these occasionally, to cycle the wear about. You might want to be careful with the MD partition containing /boot though, as the BIOS still tends to load off the first drive it sees, which is a problem if you've demoted it to a spare and then updated your kernel.

[ Parent | Reply to this comment ]

Posted by Anonymous (68.185.xx.xx) on Thu 22 Mar 2007 at 15:40
One of the problems I have had in the past with software RAID is if the system loses power (UPS fails, which has recently happened) and the system shuts down. I am using Dell 2400 servers with RAID 1 and 5. RAID went together fine, but when the system lost power, I was unable to recover and was unable to boot into the system. I am new to Linux and really like the speed and usability of the sw RAID. So, how does one ensure that the system will be able to boot normally when something like the above happens? Currently, my only solution (which has been working) is to run with hw raid (same configuration).

[ Parent | Reply to this comment ]

Posted by cef (59.167.xx.xx) on Fri 23 Mar 2007 at 02:23
If you mean that the boot drive won't boot, then it depends on what the failure is.

If the drive is totally dead, and you have specified the other drives as part of the boot order in the BIOS, you should be fine.

If the drive is alive but corrupted, then you are pretty much seriously out of luck. This is the one real disadvantage with software RAID. I could be wrong here, but I do not believe that GRUB actually knows too much about RAID, and may not know how to handle a non-fresh (eg: failed) disk in an array. Hence why I was suggesting to be cautious with rotating out sda1 in the example above from an active drive to a spare, as the data will effectively still be on there, but it will be out of date.

Note that this is an artifact of software RAID, not specifically the Linux implementation of same.

BTW: I've had no problems with my software RAID 1's and 5's with the above setups, so I'm wondering what distro and versions of the kernel/raidtools you have, as that may explain some of the issues.

[ Parent | Reply to this comment ]

Posted by Anonymous (64.13.xx.xx) on Fri 23 Mar 2007 at 06:19
The distro was Sarge but unsure of version (approx 4-6 months ago) 2.6.10 to 18?? I have not tried the above but have been wary of giving sw raid another try since the failure. Again I am new, but from what I remember I was using mdadm to manage the array.(?) I was also able to remove then add the drives back into the array with no issues with mdadm. This was one cool feature that I liked about sw raid (along with the performance)

No hardware issues when drives became corrupt, but not knowing anything about sw raid and having the system basically become un-usable was pretty shocking. I am also not sure if I was using GRUB or lilo (as some of my installs for some reason would only allow me to use lilo) but most installs were GRUB, so who knows, I say GRUB.

The system is a Dell 2400 with 4 drives. I was using the perc 2 controller (which had also let me down when one drive failed (something Dell says has been a problem and they feel my pain) and did not allow me to rebuild the array, hence another reason for the sw raid switch)

I am still exploring debian and have since installed etch onto the system with hw raid 0 and 5. I came across this article and hope it is something that will work out for me. I'm going to give sw raid another go and use the above guide. As the above really does not give any real idea to what I was using and possible issues with the sw raid, I'm gonna assume nature of sw raid, or possibly GRUB.

[ Parent | Reply to this comment ]

Posted by cef (59.167.xx.xx) on Sat 24 Mar 2007 at 01:20
The perc2 is SCSI, yes? That really does look like some sort of software issue, or possibly a management issue on the controller (not allowing you to boot off multiple drives, not detecting drives with IDs higher than the dead one, etc). Unfortunately on the management side, there isn't much you can do to get around it. If it kept failing to see drives past the dead one, admittedly you could pull out the cable on the dead one and hopefully things should be fine.

If you were using Parallel ATA drives (standard IDE), then the following might apply: Using master/slave combinations in a RAID array is not a good idea. If the electronics of a PATA drive goes, it can take out the IDE bus, as unlike SCSI, they really weren't designed for hot-swapping, as the electronics (mostly) can't handle it. In the case of a master, this may stop the slave from being seen, meaning that the failure of a master drive causes the failure of two of the drives in your array, instead of just one. RAID 5 can only recover from one dead drive, so it's a bit of a problem.

All of the hardware raid controllers I have seen for Parallel ATA drives have 4 (or more) separate controllers on board, every drive gets its own cable and you run the drives in master mode. By going down the master/slave path, you lose such protection, so don't complain if it fails! Great for testing RAID code if you're a kernel hacker, but not useful for anything remotely resembling production.

Serial ATA doesn't suffer from these problems, as you don't daisy-chain the devices.

On the software side I do know that the raid tools have changed a bit, so your problems might have been caused by that end.

Also: Did you actually create a config file for mdadm that contained all the raid details? I've found the autodetect stuff (which is in the kernel) is not as reliable as it could be, so following that up with explicit definitions of what should be where seemed to help me in most of my situations. Most of the new installer stuff does that for you, so this only applies to older installs or manually created arrays.
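
For the record, capturing the currently assembled arrays into that config file is usually a one-liner; a sketch assuming the Debian mdadm package's file location:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf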

PS: The hardware bit on PATA I gave for reference. I'm sure someone out there might find it useful. It was also predominantly typed in before I realised that the perc2 controller is SCSI. Rather than lose it, I simply edited it a bit.

[ Parent | Reply to this comment ]

Posted by Fakirrr (82.229.xx.xx) on Tue 27 Mar 2007 at 13:17
One thing should be added to this nice article in case this installation is being done on brand new pristine disks.

If Grub is being installed on the RAID1 boot sector rather than MBR and you are on x86 or x86_64, the debian installer will probably prompt you about having an MBR installed (as this is required for the BIOS to initially access the disk).

At this step you can only pick from one of the physical devices and not the RAID partitions.
So the MBR should be manually installed on the other disks as a post installation task to ensure that no disk is being left MBRless and so unusable by the BIOS.

This should be true with PATA hardware and is something I went through when performing RAID sanity tests after an etch install (a year ago or so).

Most of the time I have no specific requirements for an MBR, so I usually tend to install the bootloader on the MBR and then duplicate it by hand on the other disks.

[ Parent | Reply to this comment ]

Posted by rpetre (83.166.xx.xx) on Fri 6 Apr 2007 at 13:41
For the record, here's how I do the MBR replication:

# grub --no-floppy

device (hd0) /dev/sda
root (hd0,0)
setup (hd0)

device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)

device (hd0) /dev/sdc
root (hd0,0)
setup (hd0)

... and so on.

Notes:
* --no-floppy speeds up grub's loading
* the 'device' trick ensures that the 2nd stage and the kernel are loaded from the same disk as the MBR, and provides some independence from the BIOS settings (I've seen some voodoo cases where this was required)
* to be noted that after the first disk, the grub-shell history is of great use: 3xup, bksp, b, enter, 3xup, enter, 3xup, enter, and so on ;)
* take great care that the raid1 is in sync, to ensure that all the required files are in their final position on disk
* thanks to grub's architecture, this only has to be done when upgrading grub or when changing a disk, not on every reconfiguration or kernel upgrade.

[ Parent | Reply to this comment ]

Posted by Anonymous (62.142.xx.xx) on Tue 27 Mar 2007 at 13:48
Should these instructions work for the etch RC2 installer as well? I sent this how-to to a friend of mine, and he is complaining that the how-to doesn't follow the new installer. And I can't find the daily netinst image for 13/03/2007 on the net, either.

Thanks in advance!

[ Parent | Reply to this comment ]

Posted by Anonymous (85.145.xx.xx) on Tue 3 Apr 2007 at 01:23
To make sure LVM doesn't get "confused" by the separate disks versus the RAID volume, we tell lvm only to start on the md1 block device

And this is really important; last week I destroyed an LVM setup on a testing machine. It hung while a raid1 was syncing and I was resizing an LVM logical volume on top of the raid :) The error message after reboot said something about working with the sda and sdb devices, not with the md device. I am not at the machine, so it was easier for my friend to reinstall it, but maybe it could have been helped somehow though..

[ Parent | Reply to this comment ]

Posted by Anonymous (81.178.xx.xx) on Mon 9 Apr 2007 at 18:59
FYI, the Sarge installer could do this just as easily.

[ Parent | Reply to this comment ]

Posted by ElizabethBevilacqua (72.94.xx.xx) on Wed 11 Apr 2007 at 15:59
Since our boot/root partition is not on LVM, the bugs in the Sarge installer with regard to booting and LVM would not apply, so you're probably right. The Etch installer was able to solve most of these problems and make the process much easier if you want to do something more complex than the basics I outlined.

[ Parent | Reply to this comment ]

Posted by Anonymous (87.13.xx.xx) on Mon 30 Jul 2007 at 16:00
Hi I have a problem.
I have a debian etch stable.
I made this setup:
3 hd.
3 x 1 gb for /boot raid 1
3 x 15 gb for / raid 5 (lvm)
3 x 20 gb for /home raid 5 (lvm)
3 x 256 mb for swap raid 5 (lvm)

I did all of this during the installation process; I used LVM on the RAID 5 partitions.
I have never used LVM before, "only" RAID.

The problem is that I have always recompiled the kernel from vanilla sources, and I never used an initrd because I usually build the "important" things into the kernel configuration.

This time, with LVM, it doesn't work:
So: have I left something out of the kernel? (dm-mod is the part for LVM, right?)
Do I need an initrd anyway?

The error I get during boot is (after the RAID is correctly started):
VFS: cannot open root device "mapper/name-of-the-volume" or unknown-block (0,0)
Please append a correct "root=" boot option
md0 (driver?)*
kernel panic not syncing vfs: unable to mount root fs on unknown-block (0,0)

* md0 is the raid 5 metadevice

Thanks in advance.

[ Parent | Reply to this comment ]

Posted by drdebian (194.208.xx.xx) on Thu 30 Aug 2007 at 18:53
Instead of putting / on the RAID1 outside of LVM with 1 GB, I'd only have put /home on a RAID1 outside the LVM, with a round 200MB. Make the rest a RAID5 and use that as LVM volume, containing /, swap etc.

This is much more flexible and allows Grub and LiLo to boot just fine. After all, the bootloader only needs the kernel and initrd to figure out how to access LVM on a RAID5.

[ Parent | Reply to this comment ]

Posted by Anonymous (212.129.xx.xx) on Tue 4 Nov 2008 at 14:30
you mean a 200MB /boot on a RAID1 outside the LVM (this works well indeed)

[ Parent | Reply to this comment ]

Posted by drdebian (194.208.xx.xx) on Tue 4 Nov 2008 at 16:22
Of course, thanks for pointing that out.

[ Parent | Reply to this comment ]

Posted by Anonymous (83.103.xx.xx) on Fri 15 Feb 2008 at 11:15
Hi All,
I've done everything here, but after using my new server for a while, /dev/md0 is now 100% full. How can I increase it?

Thanks all.

[ Parent | Reply to this comment ]

Posted by Anonymous (85.180.xx.xx) on Mon 2 Mar 2009 at 12:52
Got the same problem over here. Would be great if somebody could post the steps that need to be taken to get some space off md1 and transfer it into md0...

[ Parent | Reply to this comment ]

Posted by Anonymous (84.255.xx.xx) on Sun 6 Apr 2008 at 10:48
What would be drawbacks of placing root filesystem on LVM too?

[ Parent | Reply to this comment ]

Posted by Anonymous (96.226.xx.xx) on Wed 8 Oct 2008 at 06:13
Except if you are running SPARC hardware, in which case it won't let you install the boot loader on any md device.

First create the smallest partition you can on each drive at cylinder 0. The SILO boot loader will be installed there.

Any md devices may start at cylinder 1 (or higher). This took me a while to figure out.

Hope it helps somebody, somewhere.

[ Parent | Reply to this comment ]

Posted by sefarlow (75.53.xx.xx) on Sun 12 Apr 2009 at 01:52
I have followed the steps in the article TWICE using the Debian LENNY installer and have not been successful yet. It hangs at the step where it does INSTALL and Configure software. Earlier in the install, during the partitioning process, I had a message that the system needed to be rebooted so the installer would know about the partitions. I did not do a reboot at that point because the article says nothing about it. I REALLY want to get this Dell Poweredge 2550 server with 4 73GB SCSI drives set up with Lenny as a home server. PLEASE help!!

[ Parent | Reply to this comment ]

Posted by Anonymous (60.234.xx.xx) on Tue 5 May 2009 at 07:26
I hope you have it working now.

My setup is similar, but did it this way: (drive size & RAM irrelevant)
4GB RAM, 3x500GB SATA2 HDD'S, Debian, RAID5, LVM:

Disk1: 1G ext3 /boot (& bootable), 20G software raid5, remainder software raid5
Disk2: 1G swap, 20G software raid5, remainder software raid5
Disk3: 1G swap, 20G software raid5, remainder software raid5
- you should end up with 3 partitions on each disk

Configure Software RAID
- the 20G software raid5 (3disks, 0spare) should be assembled into md0
- md0 should be formatted as ext3 and used for /
- the remainder (large) software raid5 (3disks, 0spare) partition should be assembled into md1
- md1 should be formatted as LVM

LVM Configure
- name your LVM Group
- if you get the reboot notice, click <Go back>, then enter a name for the group (hostname of server is fine)

Then click Finish partitioning and write changes to disk; this should let you continue and go on with the Debian installer process.

[ Parent | Reply to this comment ]

Posted by amagda (134.169.xx.xx) on Thu 11 Jun 2009 at 09:35
Hi you all,

I am a "young" system administrator and, as you may guess, I would like to use RAID5 on our Network Storage Server.

The computer itself is a Dell computer with 4 SATA HDD each of 1 Tb.

Now I need to store on the Hdd several things:

1 system /; /tmp; /boot
2 backup partition - will hold the daily backups from the main NFS server 1 Tb /home/backup
3 Documents, literature - 500 Gb /home/bib
4 Projects, data - 500 Gb /home/doc
5 Old, completed projects /home/doc_archive
6 SVN partition for all projects ??GB /home/SVN

I was thinking of making the partitions like this:

sda1 - /boot (10Gb) - RAID1; sda2 swap (10Gb) - RAID1?; sda3 /tmp (10Gb) - RAID1; sda4 /home (970GB) - RAID5
sdb1 - /boot (10Gb) - RAID1; sdb2 swap (10Gb) - RAID1?; sdb3 /tmp (10Gb) - RAID1; sdb4 /home (970GB) - RAID5
sdc1 - / (30Gb) - RAID1; sdc2 /home (970GB) - RAID5
sdd1 - / (30Gb) - RAID1; sdd2 /home (970GB) - RAID5

It is the first time I am using RAID so please let me know if I am doing it wrong or if it can be done better.

Now regarding /home, I understand that one can use LVM to create partitions on the big /home folder.

Best regards and congratulations for the nice and extremely useful debian-administration web site.

Adrian

[ Parent | Reply to this comment ]

Posted by AJxn (91.95.xx.xx) on Wed 11 May 2011 at 21:39
Much too late, but anyway.

I have done this many times:

Each disk is partitioned like this,
* primary partition 500M RAID1
* primary partition uses rest of the disk as one RAID5 (or RAID6)

The primary 500M partition is a RAID1 just for /boot with ext3. This is needed because GRUB can't boot from LVM (or RAID; it uses one disk from the RAID1 as a single disk). Don't forget to install GRUB on the MBR of each disk, as mentioned above.
(Does anyone know how to set this up so it's done automatically each time GRUB is updated?)

The other partition with RAID5 is used as a physical volume for LVM.
Put that physical volume into one volume group (like a hard disk).
Create all logical volumes (like partitions) from that volume group.
Don't use up all the space now; save some so you can expand your logical volumes and filesystems later.
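
Growing a volume later is then just a couple of commands; a sketch assuming ext3 and the naming used in the article:

# grow the logical volume, then grow the filesystem to match
# (unmount first if online resizing is not supported on your kernel)
lvextend -L +5G /dev/servername/servername_home
resize2fs /dev/servername/servername_home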

But you probably know all that by now; this is for anyone else who looks here. :)

[ Parent | Reply to this comment ]
