
Re: LSI MegaRAID SAS 9240-4i hangs system at boot



On 06/12/2012 09:40 AM, Ramon Hofer wrote:
> On Sun, 10 Jun 2012 17:30:08 -0500
> Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> On 6/10/2012 9:00 AM, Ramon Hofer wrote:
>>> A situation update: Mounted the mobo with the CPU and RAM, attached
>>> the PSU, the OS SATA disk, the LSI and expander as well as the
>>> graphics card. There are no disks attached to the expander because
>>> I put them back into the old NAS and am backing up the data from the
>>> 1.5 TB disks to it.
>>>
>>> Then I installed Debian Squeeze AMD64 without problems. I don't have
>>> the over-current error messages anymore :-)
>>> But it still hangs at the same time as before.
>>
>> Try the Wheezy installer.  Try OpenSuSE.  Try Fedora.  If any of these
>> work without lockup we know the problem is Debian 6.  However...
> 
> I didn't do this because the LSI worked with the Asus mobo and
> Debian Squeeze, and because I couldn't install OpenSuSE or Fedora.
> But I will give it another try...
> 
> 
>> Please call LSI support before you attempt any additional
>> BIOS/firmware updates.
> 
> I mailed them and got this answer:
> 
> "Unfortunately, the system board has not been qualified on the hardware
> compatibility list for the LSI MegaRAID 9240 series controllers. There
> could be any number of reasons for this; either it has not yet been
> tested or did not pass testing, but the issue is likely an
> incompatibility.
> 
> It sounds like the issue is related to the bootstrap, so either to
> resolve the issue you will have to free up the option ROM space or
> limit the number of devices during POST."
> 
> This is what you've already told me.
> If I understand it right you already told me to try both: free up the
> option ROM and limit the number of devices, right?
> 
> 
> (...)
> 
>>> Thanks again very much.
>>> The air flow / cooling argument is very convincing. I haven't
>>> thought about that.
>>
>> Airflow is 80% of the reason the SAS and SATA specifications were
>> created.
> 
> You've convinced me: I will mount the expander properly to the case :-)
> 
> 
>>> It was the P7P55D premium.
>>>
>>> One of the two problems I have with this board is that I'd have to
>>> find the right BIOS settings to enable the LSI setup utility (or
>>> whatever it's called exactly) where one can set up the disks as
>>> JBOD / HW RAID.
>>
>> I already told you how to do this with the C7P67.  Read the P7P55D
>> manual, BIOS section.  There will be a similar parameter to load the
>> BIOS ROMs of add in cards.
> 
> Ok, thanks!
> 
> 
>>> Sorry I don't understand what you mean by "don't put partitions on
>>> your mdraid devices before creating the array".
>>> Is it wrong to partition the disks and then do "mdadm --create
>>> --verbose /dev/md0 --auto=yes --level=6
>>> --raid-devices=4 /dev/sda1.1 /dev/sdb1.1 /dev/sdc1.1 /dev/sdd1.1"?
>>>
>>> Should I first create an empty array with "mdadm --create
>>> --verbose /dev/md0 --auto=yes --level=6 --raid-devices=0"
>>>
>>> And then add the partitions?
>>
>> Don't partition the drives before creating your md array.  Don't
>> create partitions on it afterward.  Do not use any partitions at
>> all.  They are not needed.  Create the array from the bare drive
>> device names.  After the array is created format it with your
>> preferred filesystem, such as:
>>
>> ~$ mkfs.xfs /dev/md0
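>>
>> For example, the whole sequence might look roughly like this (just a
>> sketch; assuming the four bare drives show up as /dev/sd[abcd] -- adjust
>> the device names and mount point to your system):
>>
>> ~$ mdadm -C /dev/md0 -n4 -l5 /dev/sd[abcd]   <-- whole disks, no partitions
>> ~$ mkfs.xfs /dev/md0
>> ~$ mount /dev/md0 /mnt/storage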
> 
> Ok, understood. Partitions don't belong under or on top of md arrays.
> 
> 
>>> Hmm, that's a very hard decision.
>>> You probably understand that I don't want to buy 20 3 TB drives
>>> now. And still I want to be able to add some 3 TB drives in the
>>> future. But at
>>
>> Most novices make the mistake of assuming they can only have one md
>> RAID device on the system, and if they add disks in the future they
>> need to stick them into that same md device.  This is absolutely not
>> true, and it's not a smart thing to do, especially if it's a parity
>> array that requires a reshape, which takes dozens of hours.
>> Instead...
> 
> Nono, I was aware that I can have several RAID arrays.
> My initial plan was to use four disks with the same size and have
> several RAID5 devices. But Cameleon from the debian list told me to not
> use such big disks (>500 GB) because reshaping takes too long and
> another failure during reshaping will kill the data. So she proposed to
> use 500 GB partitions and RAID6 with them.
> 
> Is there some documentation why partitions aren't good to use?
> I'd like to learn more :-)
> 
> 
>>> the moment I have four Samsung HD154UI (1.5 TB) and four WD20EARS (2
>>> TB).
>>
>> You create two 4 drive md RAID5 arrays, one composed of the four
>> identical 1.5TB drives and the other composed of the four identical
>> 2TB drives.  Then concatenate the two arrays together into an md
>> --linear array, similar to this:
>>
>> ~$ mdadm -C /dev/md1 -c 128 -n4 -l5 /dev/sd[abcd]  <-- 2.0TB drives
> 
> May I ask what the -c 128 option means? The mdadm man page says that -c
> is to specify the config file?
> 
> 
>> ~$ mdadm -C /dev/md2 -c 128 -n4 -l5 /dev/sd[efgh]  <-- 1.5TB drives
>> ~$ mdadm -C /dev/md0 -n2 -l linear /dev/md[12]
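>>
>> You can watch the initial build of the two RAID5 arrays with:
>>
>> ~$ cat /proc/mdstat
>>
>> The linear concatenation itself has no parity to initialise, so it is
>> usable as soon as it is created.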
> 
> This is very interesting. I didn't know that this is possible :-o
> Does it work as well with hw RAID devices from the LSI card?
> Since you tell me that RAIDs with partitions aren't wise I'm thinking
> about creating hw RAID5 devices with four equally sized disks.
> 
> The -C option means that mdadm creates a new array with the
> name /dev/md1.
> Is it wise to use other names, e.g. /dev/md_2T, /dev/md_1T5
> and /dev/md_main?
> 
> And is a linear raid array the same as RAID0?
> 
> 
>> Then make a write aligned XFS filesystem on this linear device:
>>
>> ~$ mkfs.xfs -d agcount=11,su=131072,sw=3 /dev/md0
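>>
>> (Just to spell out where those numbers come from: each 4 drive RAID5
>> has 3 data spindles, so with the 128KB chunk used above the full stripe
>> is
>>
>>   su x sw = 128KB x 3 = 384KB
>>
>> which is why su=131072 bytes and sw=3 are passed to mkfs.xfs, and why
>> both arrays end up with the same 384KB stripe width.)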
> 
> Are there similar options for jfs?
> I decided to use jfs when I set up the old server because it's easier
> to grow the filesystem.
> But when I see the xfs_growfs below I'm not sure if xfs wouldn't be the
> better choice, especially because I read in Wikipedia that xfs is
> integrated in the kernel and to use jfs one has to install additional
> packages.
> 
> Btw it seems very complicated with all the allocation groups, stripe
> units and stripe width.
> How do you calculate these numbers?
> And why do both arrays have a stripe width of 384 KB?
> 
> 
>> The end result is a 10.5TB XFS filesystem that is correctly write
>> stripe aligned to the 384KB stripe width of both arrays.  This
>> alignment prevents extra costly unaligned RMW operations (which
>> happen every time you modify an existing file).  XFS uses allocation
>> groups for storing files and metadata and it writes to all AGs in
>> parallel during concurrent access.  Thus, even though your
>> spindles are separated into two different stripes instead of one
>> large stripe, you still get the performance of 6 spindles.  Two RAID
>> 5 arrays actually give better performance, as you will have two md
>> threads instead of one, allowing two CPU cores to do md work instead
>> of only one with md RAID6.
> 
> Is it also true that I will get better performance with two hw RAID5
> arrays?
> 
> 
>> So now you've run out of space or nearly so, and need to add more.
>> Simple.  Using four new drives (so our array geometry remains the
>> same), say 3TB models, you'd create another RAID5 array:
>>
>> ~$ mdadm -C /dev/md3 -c 128 -n4 -l5 /dev/sd[ijkl]
>>
>> Now we grow the linear array:
>>
>> ~$ mdadm --grow /dev/md0 --add /dev/md3
>>
>> And now we grow the XFS filesystem:
>>
>> ~$ xfs_growfs /your/current/xfs/mount_point
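>>
>> You can verify the new size afterward with something like:
>>
>> ~$ df -h /your/current/xfs/mount_point
>> ~$ xfs_info /your/current/xfs/mount_point   <-- shows agcount, sunit/swidth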
>>
>> Now your 10.5TB XFS filesystem is 19.5TB and has 9TB additional free
>> space, with additional AGs, still aligned to the RAID stripe size of
>> the md RAID arrays, which are all identical at 384KB.  And unlike an
>> md reshape of an 8 drive RAID6 array which can take over 36 hours,
>> the XFS grow operation takes a few seconds.  Creating the new 4 drive
>> array will take much longer, but not nearly as long as a reshape of
>> an 8 drive array involving 4 new drives.
>>
>>> Actually I've just seen that the Samsungs are green drives as well.
>>
>> I fear you may suffer more problems down the road using WDEARS drives
>> in md RAID, or any green drives.
> 
> What if I lose a complete raid5 array which was part of the linear
> raid array? Will I lose the whole content from the linear array as I
> would with lvm?
> 
> 
>>> The reason why I bought green drives is that the server provides
>>> mythbackend, nas, logitech media server, etc.
>>> So it doesn't have much to do, but it still should be ready all the
>>> time (if I want to listen to music I don't want to power on the
>>> squeezebox radio, have it trigger the server to start up, and only be
>>> able to listen once it has booted, which would probably take >1 min).
>>> So I thought the drives should manage themselves to save some power.
>>
>>> I understand that there may be timing problems. But do they make it
>>> impossible?
>>
>> Just make sure you don't have any daemons accessing directories on the
>> md array(s) and you should be fine.  IIRC the WD Green drives go to
>> sleep automatically after 30 seconds of inactivity, and park the heads
>> after something like <10 seconds.  I'll personally never own WD Green
>> drives due to their reputation for failure and anemic performance.
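>>
>> If you do keep the green drives, one thing worth trying (treat this as
>> a sketch -- whether the EARS firmware honours it is another matter) is
>> disabling the standby timer and relaxing power management with hdparm:
>>
>> ~$ hdparm -S 0 /dev/sdX    <-- disable the spindown timeout
>> ~$ hdparm -B 255 /dev/sdX  <-- disable APM, if the drive supports it
>>
>> The aggressive head parking on those drives is a separate firmware idle
>> timer, though, and typically needs a vendor utility to change.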
> 
> Thanks for the advice!
> 
> About the failure: This is why I use raid5 and as I don't need very
> high performance this doesn't matter for me.
> But I understand that they are risky.
> 
> I'm still aware that 3 TB raid5 rebuilds take long. Nevertheless I think
> I will risk using normal (non-green) disks for the next expansion.
> 
> If I'm informed correctly there are not only green drives and normal
> desktop drives but also server disks with a higher quality than
> desktop disks.
> 
> But still I don't want to "waste" energy. Would the Seagate Barracuda
> 3TB disks be a better choice?
> 
> 
>>> What would you do if you were me?
>>>
>>> Let's say I'd "throw away" these disks and go for 3 TB drives. At
>>> the
>>
>> I wouldn't.  3TB drives take far too long to rebuild.  It takes about
>> 8 hours to rebuild one in a mirror pair, something like 30+ hours to
>> rebuild a drive in a 6 drive RAID6.  If a 3TB drive fails due to
>> age/wear, and your drives are identical, the odds of having two more
>> drive failures before the rebuild completes are relatively high.  If
>> this happens, you better have a big full backup handy.  Due to this
>> and other reasons, I prefer using drives of 1TB or less.  My needs are
>> different than yours, however--production servers vs home use.
> 
> My needs are probably *much* less demanding than yours.
> Usually it only has to do read access to the files, plus the occasional
> copy of bluray rips to it. But most of the time it (the raid) sits
> around doing nothing. MythTV records most of the time, but to a non-RAID
> disk.
> So I hope with non-green 3 TB disks I can get some security from the
> redundancy and still get a lot of disk space.
> 
> 
>>> moment four in a RAID 6 array would be enough. So I'd have 6 TB
>>> available.
>>
>> Never build a RAID6 array with less than 6 drives, as RAID10 on 4
>> drives gives vastly superior performance vs a 5/6 drive RAID6, and
>> rebuild times are drastically lower for mirrors.  Rebuilding a RAID1
>> 3TB drive takes about 8 hours, and you're looking at something north
>> of 30 hours for a 6 drive RAID6 rebuild.
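>>
>> (For reference, a 4 drive RAID10 is created much like the RAID5
>> examples above, e.g.:
>>
>> ~$ mdadm -C /dev/mdX -c 128 -n4 -l10 /dev/sd[wxyz]
>>
>> with /dev/mdX and the drive letters standing in for whatever is free
>> on your system.)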
> 
> Because you told me that it's not good to use partitions I won't set up
> raid6.
> Instead I'll go for raid5 with 4 disks.
> 
> 
>>> Then I'd run out of space and want to upgrade with another disk.
>>> Probably it'll still be available but will it also be when I'll
>>> have 19 disks and want to add the last one?
>>> Just as an example to explain my worries ;-)
>>
>> Start with the 4 drive RAID5 arrays I mentioned above.  You have a 20
>> cage case, 4 rows of 5 cages each.  Put each array in its own 4 cage
>> row to keep things organized.  You have two rows empty as you have 8
>> drives currently.  When you expand in the future, only expand 4
>> drives at a time, using the instructions I provided above.  You can
>> expand two times.  Using 2TB drives you'll get +6TB twice for a total
>> of 22.5TB.
> 
> This was exactly what I had in mind in the first place. But the
> suggestion from Cameleon was so tempting :-)
> 
> 
> Btw I have another question:
> Is it possible to attach the single (non-raid) disk I now have in my old
> server for the mythtv recordings to the LSI controller and still have
> access to the content when it's configured as jbod?
> Since it wouldn't be very bad if I lost these recordings, I'd like to
> avoid backing this disk up.
> 
> 
> Cheers
> Ramon
> 
> 

	I don't know if the problem I experienced with a LSI MegaRAID SAS
9265-8i is the same as what you're experiencing, but I found that the
megaraid_sas.ko driver in Debian's stock kernel is not new enough to
support the card. I had to go to LSI's site and get the updated driver
from them. Unfortunately they only had a driver compiled for the 5.0.x
installer kernel version, and the source didn't compile cleanly against
the 6.0.x kernel header packages. I ended up having to patch the code
to clean up the compile issues and was then able to build a
megaraid_sas driver that I could use on a 6.0 install.
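
	In case it helps anyone, the build itself was the usual out-of-tree
module procedure, roughly like this (the tarball name below is just a
placeholder for whatever LSI actually ships, and package names may
differ slightly on your system):

~$ apt-get install build-essential linux-headers-$(uname -r)
~$ tar xzf megaraid_sas-src.tar.gz && cd megaraid_sas-src
~$ make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
~$ insmod ./megaraid_sas.ko

plus the small patches mentioned above to get the source past the
6.0.x headers.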

