Re: LSI MegaRAID SAS 9240-4i hangs system at boot
On Tue, 12 Jun 2012 17:30:43 -0500
Stan Hoeppner <stan@hardwarefreak.com> wrote:
> On 6/12/2012 8:40 AM, Ramon Hofer wrote:
> > On Sun, 10 Jun 2012 17:30:08 -0500
> > Stan Hoeppner <stan@hardwarefreak.com> wrote:
>
> >> Try the Wheezy installer. Try OpenSuSE. Try Fedora. If any of
> >> these work without lockup we know the problem is Debian 6.
> >> However...
> >
> > I didn't do this because the LSI worked with the Asus mobo and
> > Debian Squeeze, and because I couldn't install OpenSuSE or Fedora.
> > But I will give it another try...
>
> Your problem may involve more than just the two variables. The
> problem may be mobo+LSI+distro_kernel, not just mobo+LSI. This is
> why I suggested trying to install other distros.
Aha, this is true - I didn't think about that...
> >> Please call LSI support before you attempt any additional
> >> BIOS/firmware updates.
>
> Note I stated "call". You're likely to get more/better
> information/assistance speaking to a live person.
I didn't have enough confidence in my spoken English :-(
> > It sounds like the issue is related to the bootstrap, so either to
> > resolve the issue you will have to free up the option ROM space or
> > limit the number of devices during POST."
>
> This is incorrect advice, as it occurs with the LSI BIOS both enabled
> and disabled. Apparently you didn't convey this in your email.
I will write it to them again.
But to be honest I think I'll leave the Supermicro and use it for my
Desktop.
(...)
> > Nono, I was aware that I can have several RAID arrays.
> > My initial plan was to use four disks with the same size and have
> > several RAID5 devices.
>
> This is what you should do. I usually recommend RAID10 for many
> reasons, but I'm guessing you need more than half of your raw storage
> space. RAID10 eats 1/2 of your disks for redundancy. It also has the
> best performance by far, and the lowest rebuild times by far. RAID5
> eats 1 disk for redundancy, RAID6 eats 2. Both are very slow compared
> to RAID10, and both have long rebuild times which increase severely as
> the number of drives in the array increases. The drive rebuild time
> for RAID10 is the same whether your array has 4 disks or 40 disks.
Yes, I think for me RAID5 is sufficient. I don't need extreme
performance or extreme security. I just hope the RAID5 setup will be
safe enough :-)
> If you're more concerned with double drive failure during rebuild (not
> RESHAPE as you stated) than usable space, make 4 drive RAID10 arrays
> or 4 drive RAID6s, again, without partitions, using the command
> examples I provided as a guide.
Well, it's just multimedia data stored on this server. So if I lose
it, it won't kill me :-)
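Just so I have it written down for later: if I went that route, I assume
the commands would look roughly like this (the /dev/sd[b-e] device names
are only placeholders for my setup):

```shell
# Sketch only -- device names are assumptions; run as root.
# 4-drive RAID10 built directly on bare disks (no partitions):
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# ...or a 4-drive RAID6 instead, trading space for safety:
# mdadm --create /dev/md0 --level=6 --raid-devices=4 \
#     /dev/sdb /dev/sdc /dev/sdd /dev/sde
```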
> > Is there some documentation why partitions aren't good to use?
> > I'd like to learn more :-)
>
> Building md arrays from partitions on disks is a means to an end. Do
> you have an end that requires these means? If not, don't use
> partitions. The biggest reason to NOT use partitions is misalignment
> on advanced format drives. The partitioning utilities shipped with
> Squeeze, AFAIK, don't do automatic alignment on AF drives.
Ok, I was just confused because most of the tutorials (or at least most
of the ones I found) use partitions over the whole disk...
> If you misalign the partitions, RAID5/6 performance will drop by a
> factor of 4, or more, during RMW operations, i.e. modifying a file or
> directory metadata. The latter case is where you really take the
> performance hit as metadata is modified so frequently. Creating md
> arrays from bare AF disks avoids partition misalignment.
So if I can make things simpler I'm happy :-)
> > Does it work as well with hw RAID devices from the LSI card?
>
> Your LSI card is an HBA with full RAID functions. It is not however a
> full blown RAID card--its ASIC is much lower performance and it has no
> cache memory. For RAID1/10 it's probably a toss up at low disk counts
> (4-8). At higher disk counts, or with parity RAID, md will be faster.
> But given your target workloads you'll likely not notice a difference.
You're right.
I just had the impression that you suggested, at the beginning of this
conversation, that I use the hw RAID capability of the LSI.
> >> Then make a write aligned XFS filesystem on this linear device:
> >>
> >> ~$ mkfs.xfs -d agcount=11 su=131072,sw=3 /dev/md2
> >
> > Are there similar options for jfs?
>
> Dunno. Never used it, as XFS is superior in every way. JFS hasn't
> seen a feature release since 2004. It's been in bug-fix-only mode
> for 8 years now. XFS has a development team of about 30 people
> working at all the major Linux distros, SGI, and IBM, yes, IBM. It
> has seen constant development since its initial release on IRIX in
> 1994 and its port to Linux in the early 2000s.
I must have read outdated wikis (mostly from the mythtv project).
> > Especially because I read in wikipedia that xfs is
> > integrated in the kernel and to use jfs one has to install
> > additional packages.
>
> You must have misread something. The JFS driver was still in mainline
> as of 3.2.6, and I'm sure it's still in 3.4 though I've not confirmed
> it. So you can build JFS right into your kernel, or as a module. I'd
> never use it, nor recommend it, I'm just squaring the record.
I found this information in the german wikipedia
(http://de.wikipedia.org/wiki/XFS_%28Dateisystem%29):
"... Seit Kernel-Version 2.6 ist es offizieller Bestandteil des
Kernels. ..."
Translated: Since kernel version 2.6 it's an official part of the
kernel.
Maybe I misunderstood what the writer meant, or maybe what they wrote
is simply wrong in the first place :-?
> > Btw it seems very complicated with all the allocation groups, stripe
> > units and stripe width.
>
> Powerful flexibility is often accompanied by a steep learning curve.
True :-)
> > How do you calculate these number?
>
> Beginning users don't. You use the defaults. You are confused right
> now because I lifted the lid and you got a peek inside more advanced
> configurations. Reading the '-d' section of 'man mkfs.xfs' tells you
> how to calculate sunit/swidth, su/sw for different array types and
> chunk sizes.
Ok, if I read it right it divides the array into 11 allocation groups,
with 131072-byte (128 KiB) stripe units and a stripe width of 3 stripe
units.
But how do you know what numbers to use?
Maybe I didn't read the man page carefully enough; if so, I'd like to
apologize :-)
> Please read the following very carefully. IF you did not want a
> single filesystem space across both 4 disk arrays, and the future 12
> disks you may install in that chassis, you CAN format each md array
> with its own XFS filesystem using the defaults. In this case,
> mkfs.xfs will read the md geometry and create the array with all the
> correct parameters--automatically. So there's nothing to calculate,
> no confusion.
>
> However, you don't want 2 or 6 separate filesystems mounted as
> something like:
>
> /data1
> ...
> /data6
>
> in your root directory. You want one big filesystem mounted in your
> root as something like '/data' to create subdirs and put files in,
> without worrying about how much space you have left in each of 6
> filesystems/arrays. Correct?
Yes, this is very handy :-)
> The advanced configuration I previously gave you allows for one large
> XFS across all your arrays. mkfs.xfs is not able to map out the
> complex storage geometry of nested arrays automatically, which is why
> I lifted the lid and showed you the advanced configuration.
Ok, this is very nice!
But will it also work for any disk size (1.5, 2 and 3 TB drives)?
> With it you'll get a minimum filesystem bandwidth of ~300MB/s per
> single file IO and a maximum of ~600MB/s with 2 or more parallel file
> IOs, with two 4-drive arrays. Each additional 4 drive RAID5 array
> grown into the md linear array and then into XFS will add ~300MB/s of
> parallel file bandwidth, up to a maximum of ~1.5GB/s. This should
> far exceed your needs.
This really is enough for my needs :-)
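So if I later add four more disks, I guess the procedure would be
roughly the following (the /dev/sd[f-i] and /dev/md3 names, and the
/data mount point, are just my assumptions):

```shell
# Sketch only -- device and mount names are assumptions; run as root.
# 1. Build a new 4-drive RAID5 from the new bare disks:
mdadm --create /dev/md3 --level=5 --raid-devices=4 \
    /dev/sdf /dev/sdg /dev/sdh /dev/sdi

# 2. Grow the md linear array (/dev/md2 in your example) to include it:
mdadm --grow /dev/md2 --add /dev/md3

# 3. Grow the mounted XFS filesystem into the new space:
xfs_growfs /data
```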
> > And why do both arrays have a stripe width of 384 KB?
>
> You already know the answer. You should anyway:
>
> chunk size = 128KB
This is what I don't know.
Is this a characteristic of the disk?
> RAID level = 5
> No. of disks = 4
> ((4-1)=3)) * 128KB = 384KB
This I can follow.
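Written out as a tiny calculation, with the values from your example:

```shell
# Stripe width of an md RAID5 array = (number of disks - 1) * chunk size.
# One disk's worth of each stripe holds parity, so it doesn't count.
disks=4
chunk_kib=128
sw_kib=$(( (disks - 1) * chunk_kib ))
echo "stripe width: ${sw_kib} KiB"   # prints "stripe width: 384 KiB"
```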
> > Is it also true that I will get better performance with two hw RAID5
> > arrays?
>
> Assuming for a moment your drives will work in RAID mode with the
> 9240, which they won't, the answer is no. Why? Your CPU cores are
> far faster than the ASIC on the 9240, and the board has no battery
> backed cache RAM to offload write barriers.
>
> If you step up to one of the higher end full up RAID boards with BBWC,
> and the required enterprise drives, then the answer would be yes up to
> the 20 drives your chassis can hold. As you increase the drive
> count, at some point md RAID will overtake any hardware RAID card, as
> the 533-800MHz single/dual core RAID ASIC just can't keep up with the
> cores in the host CPU.
Very interesting!
> > What if I lose a complete raid5 array which was part of the linear
> > raid array? Will I lose the whole content from the linear array as
> > I would with lvm?
>
> Answer1: Are you planning on losing an entire RAID5 array? Planning,
> proper design, and proper sparing prevents this. If you lose a drive,
> replace it and rebuild IMMEDIATELY. Keep a spare drive on hand, or
> better yet in standby. Want to eliminate this scenario? Use RAID10
> or RAID6, and live with the lost drive space. And still
> replace/rebuild a dead drive immediately.
>
> Answer2: It depends. If this were to happen, XFS will automatically
> unmount the filesystem. At that point you run xfs_repair. If the
> array that died contained the superblock and AG0 you've probably lost
> everything. If it did not, the repair may simply shrink the
> filesystem and repair any damaged inodes, leaving you with whatever
> was stored on the healthy RAID5 array.
This sounds suitable for my needs.
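I'll note the repair commands for later; if I understand correctly, a
dry run first would look something like this (assuming /dev/md2 is the
linear device from your example and /data is the mount point):

```shell
# Sketch only -- names are assumptions; run as root.
# After XFS shuts the filesystem down, make sure it is unmounted:
umount /data

# Dry run: report what would be repaired, without writing anything:
xfs_repair -n /dev/md2

# Then the actual repair:
xfs_repair /dev/md2
```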
Just another question: will the linear raid distribute the data across
the underlying raid5 arrays?
Or will it fill up the first one and continue with the second, and so
on?
> > I'm still aware that 3 TB raid5 rebuilds take long.
>
> 3TB drive rebuilds take forever, period. As I mentioned, it takes ~8
> hours to rebuild a mirror.
>
> > Nevertheless I think
> > I will risk using normal (non-green) disks for the next expansion.
>
> What risk? Using 'normal' drives will tend to reduce RAID related
> green drive problems.
Ok, I will use normal drives in the future and hope that the green
drives won't all give up at the same time :-/
> > If I'm informed correctly there are not only green drives and normal
> > desktop drives but also server disks with a higher quality than
> > desktop disks.
>
> Yes, and higher performance. They're called "enterprise" drives.
> There are many enterprise models: 7.2K SATA/SAS, 10K SATA/SAS, 15K
> SAS, 2.5" and 3.5"
>
> > But still I don't want to "waste" energy.
>
> Manufacturing a single drive consumes as much energy as 4 drives
> running for 3 years. Green type drives tend to last half as long due
> to all the stop/start cycles wearing out the spindle bearings. Do
> the math. The net energy consumption of 'green' drives is therefore
> equal to or higher than 'normal' drives. The only difference is that
> a greater amount of power is consumed by the drive before you even
> buy it. The same analysis is true of CFL bulbs. They consume more
> total energy through their life cycle than incandescents.
Hmm, I knew that about hybrid cars but never thought about it for
HDDs.
> > Would the Seagate Barracuda
> > 3TB disks be a better choice?
>
> Is your 10.5TB full already? You don't even have the system running
> yet...
No, but I like living in the future ;-)
> > My needs are probably *much* less demanding than yours.
> > Usually it only has to do read access to the files. Additionally
> > copying bluray rips to it. But most of the time it (the raid) sits
> > around doing nothing. MythTV records most of the time, but to a
> > non-RAID disk.
> > So I hope with non-green 3 TB disks I can get some security from the
> > redundancy and still get a lot of disk space.
>
> If you have a good working UPS, good airflow (that case does), and
> decent quality drives, you shouldn't have to worry much. I'm unsure
> of the quality of the 3TB Barracuda, haven't read enough about it.
>
> Are you planning on replacing all your current drives with 4x 3TB
> drives? Or going with the linear over RAID5 architecture I
> recommended, and adding 4x 3TB drives into the mix?
I'm planning to keep the drives I have now and add 4x 3TB into the mix.
> > This was exactly what I had in mind in the first place. But the
> > suggestion from Cameleon was so tempting :-)
>
> Cameleon helps many people with many Debian/Linux issues and is very
> knowledgeable in many areas. But I don't recall anyone accusing her
> of being a storage architect. ;)
Her suggestion seemed very tempting because it would give me a raid6
without having to lose too much storage space.
She really knows a lot, so I was just happy to follow her suggestion.
> > Btw I have another question:
> > Is it possible to attach the single (non raid) disk I now have in
> > my old server for the mythtv recordings to the LSI controller and
> > still have access to the content when it's configured as jbod?
> > Since losing those recordings wouldn't be a big deal, I'd like to
> > avoid backing them up.
>
> Drop it in a drive sled, plug it into the backplane, and find out. If
> you configure it for JBOD the LSI shouldn't attempt writing any
> metadata to it.
Ok, thanks, I will do that :-)
Again, thanks a lot for all your help and your patience with me. I'm
certainly not always easy ;-)
Cheers
Ramon