
Re: Question about HotPlug and cciss hp storage



Hello everybody,

Frank, we decided to use software RAID so that we could improve our monitoring and tune the RAID with all the tools Linux provides!
And in my opinion it is better to have a tuned RAID5 (or RAID6) than an untuned RAID10:
- fewer HDDs "lost": with our 8 disks, RAID10 gives only 4 "real" (usable) disks, versus 6 with RAID6
- Linux kernel tuning gives us excellent transfer rates and I/O
- with MD and LVM we can align I/O strictly (see the sketch below)
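
To give an idea of what I mean by the last point, here is roughly the
kind of setup we use (device names, chunk size and LV size below are
only examples, not our exact configuration).

A 6-disk RAID6 with a 512 KiB chunk has 4 data disks, i.e. a 2 MiB
full stripe:

# mdadm --create /dev/md0 --level=6 --raid-devices=6 --chunk=512 /dev/sd[b-g]

Then the LVM data area is aligned on that full stripe width:

# pvcreate --dataalignment 2M /dev/md0
# vgcreate vg_data /dev/md0
# lvcreate -L 100G -n lv_data vg_data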

But that is not what I want to discuss here.

Back to hpacucli...
I used this tool to remove 2 "unassigned" HDDs, and everything was OK until the arrayprobe script gave me a warning!
Am I unlucky??

# arrayprobe 
WARNING Arrayprobe Logical drive 1 on /dev/cciss/c0d0: Logical drive is not configured

# arrayprobe -r 
[...]
Event code 1/0/0 with tag 15
at 2-28-2012 08:26:30
with message: Hot-plug drive removed, Port=1I Box=1 Bay=3 SN=                    

Event code 4/0/0 with tag 16
at 2-28-2012 08:26:30
with message: Physical drive failure, Port=1I Box=1 Bay=3
physical drive 2 has failed with failurecode 20.

Event code 0/0/0
with message: No events to report.

failed to open device /dev/ida/c0d0: No such file or directory
Logical drive 0 on controller /dev/cciss/c0d0 has state 0
Logical drive 1 on controller /dev/cciss/c0d0 has state 2
Logical drive 2 on controller /dev/cciss/c0d0 has state 2
Logical drive 3 on controller /dev/cciss/c0d0 has state 0
WARNING Arrayprobe Logical drive 1 on /dev/cciss/c0d0: Logical drive is not configured

So my logical drive 1, mapped on /dev/cciss/c0d0, is not configured?... Impossible, this is my "system RAID":
# hpacucli ctrl slot=0 ld 1 show

Smart Array P410i in Slot 0 (Embedded)

   array A

      Logical Drive: 1
         Size: 136.7 GB
         Fault Tolerance: RAID 1
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 35132
         Strip Size: 256 KB
         Status: OK
         Array Accelerator: Enabled
         Unique Identifier: 600508B1001C02B880CF0DDEE7FD0FEC
         Disk Name: /dev/cciss/c0d0
         Mount Points: /boot 190 MB
         OS Status: LOCKED
         Logical Drive Label: A00FD4B45001438009E6ADD046E6
         Mirror Group 0:
            physicaldrive 1I:1:1 (port 1I:box 1:bay 1, SAS, 146 GB, OK)
         Mirror Group 1:
            physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 146 GB, OK)
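
For reference, the controller can also summarise the state of every
logical drive in one shot; I have not pasted that output here, but I
can send it if it helps:

# hpacucli ctrl slot=0 ld all show status
# hpacucli ctrl slot=0 show config detail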

So, what's wrong now??
I'm asking for your help again on this point, and I hope it will be the last time...

-- 
JG

On 23 February 2012 at 21:38, Stan Hoeppner <stan@hardwarefreak.com> wrote:
On 2/23/2012 10:16 AM, Julien Groselle wrote:

> Now I'm sure that it is a must-have.
> From 4 years ago until last year we only had hardware RAID, so we never
> needed to do anything with the HDDs...
> Now with md RAID we do! :)

RAID 0 arrays are not fault tolerant, so there is nothing the controller
can do when a single drive configured as such fails.  RAID 1 mirrors,
however, are fault tolerant.

Thus, the proper way to do what you are attempting to do, with
proprietary RAID cards, is to use hybrid nested hardware/mdraid arrays.
 For example, if you want a straight mdraid 10 array but you still want
the RAID card to handle drive fail/swap/rebuild automatically as it did
in the past, you would create multiple RAID 1 mirrors in the controller
and set the rebuild policies as you normally would.  Then you create an
mdraid 0 stripe over the virtual drives exported by the controller,
giving you a hybrid soft/hardware RAID 10.
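
As a rough sketch (device names are only examples; with the cciss
driver the logical drives show up as /dev/cciss/c0dX, and I'm assuming
c0d1 and c0d2 are two hardware RAID 1 volumes created for this
purpose):

# mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=256 \
      /dev/cciss/c0d1 /dev/cciss/c0d2

md only sees two "drives" here, so everything below that layer --
failure detection, spares, rebuilds -- stays with the controller.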

You likely won't see much performance gain with this setup vs. using a
single RAID card with hardware RAID 10.  The advantage of this setup
really kicks in when you create the mdraid 0 stripe across many RAID 1
mirrors residing on 2 or more hardware RAID controllers.  The 3 main
benefits of this are:

1.  Striping can occur across many more spindles than can be achieved
   with a single RAID card
2.  You keep the hardware write cache benefit
3.  Drive failure/replace/rebuild is handled transparently

Obviously it's not feasible to do parity RAID schemes in such a hybrid
setup.  If your primary goal of switching to mdraid was to increase the
performance of RAID6, then you simply can't do it with a single RAID
card *and* still have automatic drive failure management.  As they say,
there's no such thing as a free lunch.

If RAID6 performance is what you're after, and you want mdraid to be
able to handle the drive failure/replacement automatically without the
HBA getting in the way, then you will need to switch to non-RAID HBAs
that present drives in JBOD/standalone fashion to Linux.  LSI makes many
cards suitable for this task.  Adaptec has a few as well.  They are
relatively inexpensive ($200-300 USD), and models with both internal SFF8087
and external SFF8088 ports are available.  Give me the specs on your
Proliant, how many drives you're connecting, internal/external, and I'll
shoot you a list of SAS/SATA HBAs that will work the way you want.
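
Just to illustrate the md side of that (device names are placeholders;
a hot spare plus mdadm's monitor mode is what gives you the automatic
failure handling):

# mdadm --create /dev/md0 --level=6 --raid-devices=6 --spare-devices=1 \
      /dev/sd[b-h]
# mdadm --monitor --scan --daemonise --mail=root

When a member drive dies, md fails it out, pulls in the spare and
rebuilds on its own; the --mail option just makes sure you hear about
it.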

> But I have another problem: hpacucli doesn't work with every kernel version!
> To spare you the details, here are my results:
>
> 2.6.32.5-amd64 : OK
> 2.6.38-bpo.2-amd64 : NOK
> 2.6.39_bpo.2-amd64 : NOK
> 3.2.0.0.bpo.1-amd64 : NOK

This is very common with proprietary vendor software.  They have so many
distros to support that they must limit their development and
maintenance efforts to a very limited number of configurations and
kernel versions.  When you look at RHEL kernels for instance, they never
change major numbers during a release lifecycle.  So you end up with
things like 2.6.18-274.18.1.el5.  This is what is called a "long term
stable kernel".  Thus, when a vendor qualifies something like a RAID
card driver or management tool for RHEL 5, they don't have to worry
about their software breaking as Red Hat updates this kernel over the
life of the release, with things like security patches etc.  This is the
main reason why RHEL and SLES are so popular in the enterprise
space--everything 'just works' when vendor BCPs are followed.

To achieve the same level of functionality with Debian, you must stick
with the baseline kernel, 2.6.32-5, and its security updates only.
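
If you want to be certain nothing from backports ever replaces that
kernel, one blunt way to do it (too blunt if you use backports for
other packages) is an apt pin, e.g. in /etc/apt/preferences.d/:

Package: *
Pin: release a=squeeze-backports
Pin-Priority: -1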

Welcome to the world of "enterprise" hardware.

--
Stan

