Re: LSI MegaRAID SAS 9240-4i hangs system at boot



On 6/14/2012 9:45 AM, Ramon Hofer wrote:
> On Thu, 14 Jun 2012 08:38:27 -0500
> Stan Hoeppner <stan@hardwarefreak.com> wrote:
> 
>> Couldn't hurt.  And while you're at it, mount with "inode64" in your
>> fstab immediately after you create the XFS.  You were running with
>> inode32, which sticks all the inodes at the front of AG0 causing lots
>> of seeks.  Inode64 puts file/dir inodes in the AG where the file gets
>> written.  In short, inode64 is more efficient for most workloads.  And
>> this is also why getting the agcount correct is so critical with
>> tiered linear/striped parity setups such as this.
>>
>> When you recreate the XFS use 'agcount=6'.  That's the smallest you
>> can go with 2TB disks.  A force will be required since you already
>> have an XFS on the device.
> 
> Sorry I haven't much time now. I'm invited to a BBQ and already
> hungry :-)
> 
> I just wanted to create the filesystem and start to copy the files.
> 
> So I tried and got this warning:
> 
> ~$ sudo mkfs.xfs -f -d agcount=6,su=131072,sw=3 /dev/md0
> Warning: AG size is a multiple of stripe width.  This can cause
> performance problems by aligning all AGs on the same disk.  To avoid
> this, run mkfs with an AG size that is one stripe unit smaller, for
> example 244189120.

Grr.  This is another reason it is preferable to create the XFS atop the
linear array with both RAIDs already present, from the beginning, which
would allow the proper 11 AGs, and proper placement of them.

> Should I take this seriously?

This is a valid warning and relates to metadata performance, which is
important for everyday use.  So yeah, you should take it seriously.  So
what you should do now is, instead of making another attempt and
manually setting 7 AGs, just leave out that parm and let mkfs pick the
agcount/agsize on its own.  It will likely choose 7, but it may choose
more.  The fewer the better with 3 slow disks in this RAID5.  mkfs.xfs
doesn't take spindle speed into account, which is why I usually set
parms manually, to best fit the storage hardware.
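
To see why mkfs complains, here is a rough sketch of the arithmetic.  It
assumes the default 4096-byte filesystem block size and that the AG size
in the warning is given in filesystem blocks; neither is stated in this
thread, and the function names are only for illustration.

```shell
#!/bin/sh
# Sketch only: check whether an AG size is a multiple of the stripe width.
# BLOCK_SIZE is an assumption (mkfs.xfs default), not taken from the thread.
BLOCK_SIZE=4096
SU_BYTES=131072   # su= value from the mkfs command quoted above
SW=3              # sw= value (number of data disks) from the same command

# Print the stripe width in filesystem blocks.
stripe_width_blocks() {
    echo $(( SU_BYTES / BLOCK_SIZE * SW ))
}

# Print "aligned" if the AG size (in blocks) is a multiple of the stripe
# width, so every AG starts on the same disk; otherwise print "staggered".
ag_alignment() {
    if [ $(( $1 % $(stripe_width_blocks) )) -eq 0 ]; then
        echo aligned
    else
        echo staggered
    fi
}

ag_alignment 244189120                                 # size suggested by the warning
ag_alignment $(( 244189120 + SU_BYTES / BLOCK_SIZE ))  # one stripe unit larger

# Letting mkfs choose the AG layout means dropping agcount from the
# command quoted above (not run here, since it destroys the filesystem):
#   sudo mkfs.xfs -f -d su=131072,sw=3 /dev/md0
```

Under those assumptions the suggested size comes out staggered and the
size one stripe unit larger comes out aligned, which matches what the
warning describes.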

> Btw: Should I mount every xfs filesystem (also the one for the mythtv
> recordings) with inode64.

Yes.  Especially with XFS atop a linear array.  The inode64 allocator
spreads directory and file metadata, and files relatively evenly across
all AGs, providing better locality between files and their metadata.
This improves performance for most workloads.
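
For reference, an fstab entry with inode64 might look like the line
below.  The mount point /mnt/storage is a placeholder, not something
from this thread, and /dev/md0 is simply the device from your mkfs
command; adjust both to whatever actually holds the XFS.

```
# <device>  <mount point>  <type>  <options>          <dump>  <pass>
/dev/md0    /mnt/storage   xfs     defaults,inode64   0       0
```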

Inode32, the default allocator, puts all directory and file metadata in
AG0, so you end up with a hotspot, causing excessive disk seeking on the
first RAID5 (which is where AG0 is) in the linear array.

Inode64 will be the XFS default in the not too distant future.  It would
have been so already, but there are still some key applications in
production, namely some enterprise backup applications, that don't
understand 64bit inode numbers.  This is the only reason inode32 is
still the default.  Note that with 32bit Linux kernels you are limited
to inode32.  So make sure you're running an x64 kernel, which, IIRC, you are.

Note that for any XFS filesystem greater than 16TB, you must use the
inode64 allocator as inode32 is limited to 16TB (and again you need an
x64 kernel). In your case you will be continuously expanding your XFS as
you add more 4 drive arrays in the future.  Once you add your 4x1.5TB
drives you'll be at 10.4TB.  When you add 3x4TB drives your XFS will hit
19.5TB.  It's best to already be using inode64 when you go over the 16TB
limit to avoid problems.

> This is not true for the smaller ext4 filesystems I use for the os and
> the home dir I suppose?

No, the inode64 mount option is unique to XFS.  It simply tells the XFS
kernel driver to use the inode64 code path instead of the inode32 code path
for a given XFS filesystem, in essence passing a 0 or 1 to an XFS
variable.  You can mount multiple XFS filesystems on one machine, some
with inode32 and others with inode64.  See 'man mount'; XFS is way
down at the bottom.  Note that it's possible, but not advisable, to
change the inodeXX mount option after the filesystem has some "age" on
it.  Pick the right one from the start and stick with it.  This is
usually inode64.  There are some unique workload cases where a highly
tweaked inode32 filesystem <16TB has a performance advantage, but your
workloads aren't such cases.

And make sure you're using linux-image-3.2.0-0.bpo.2-amd64 so you have
all the latest XFS features and fixes, mainly the delayed logging code
turned on by default.
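
If you want to double check, something like the sketch below should do.
It assumes inode64 shows up in the options column of /proc/mounts when
it is in effect, which I believe is the case but haven't verified on
your kernel; the function name is made up for this example.

```shell
#!/bin/sh
# Sketch only: list XFS filesystems that are mounted without inode64.
# Reads lines in /proc/mounts format on stdin:
#   device mountpoint fstype options dump pass
xfs_without_inode64() {
    awk '$3 == "xfs" && $4 !~ /(^|,)inode64(,|$)/ { print $2 }'
}

# uname -m prints x86_64 on a 64-bit x86 kernel; uname -r prints the release.
echo "kernel arch: $(uname -m), release: $(uname -r)"

# Print the mount point of every XFS filesystem lacking inode64, if any.
if [ -r /proc/mounts ]; then
    xfs_without_inode64 < /proc/mounts
fi
```

No mount points printed means every mounted XFS already has inode64 in
its options.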

-- 
Stan

