
Bug#584881: Lockups under heavy disk IO; md (RAID) resync/check implicated



Ben Hutchings writes ("Re: Bug#584881: Lockups under heavy disk IO; md (RAID) resync/check implicated"):
> On Fri, 2010-06-25 at 11:50 +0100, Ian Jackson wrote:
> > No, I think there are two meanings of the word "barrier".  AFAICT md
> > has its own thing which it confusingly calls a "barrier"; it can be
> > "raised" and "lowered".
> 
> Oh, great!  I wondered whether this was the case but I could only find
> discussion of md vs I/O barriers.  Do you have any reference for
> documentation of md barriers?

No.  I just stumbled across them in the source.  Particularly this in
drivers/md/raid1.c:

/* Barriers....
 * Sometimes we need to suspend IO while we do something else,
 * either some resync/recovery, or reconfigure the array.
 * To do this we raise a 'barrier'.
 * The 'barrier' is a counter that can be raised multiple times
 * to count how many activities are happening which preclude
 * normal IO.
 * We can only raise the barrier if there is no pending IO.
 * i.e. if nr_pending == 0.
 * We choose only to raise the barrier if no-one is waiting for the
 * barrier to go down.  This means that as soon as an IO request
 * is ready, no other operations which require a barrier will start
 * until the IO request has had a chance.
 *
 * So: regular IO calls 'wait_barrier'.  When that returns there
 *    is no backgroup IO happening,  It must arrange to call
 *    allow_barrier when it has finished its IO.
 * backgroup IO calls must call raise_barrier.  Once that returns
 *    there is no normal IO happeing.  It must arrange to call
 *    lower_barrier when the particular background IO completes.
 */
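
To make the counting rules in that comment concrete, here is a rough
user-space model of them using pthreads.  It is only a sketch of what the
comment describes, not the code in raid1.c: the struct name, model_init(),
the single mutex/condition variable pair and the nr_waiting field are all
assumptions made for illustration.  Only the four function names, the
'barrier' counter and nr_pending come from the comment itself.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space model; not the kernel's data structure. */
struct md_barrier_model {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int barrier;     /* number of background activities precluding normal IO */
    int nr_pending;  /* normal IO requests currently in flight */
    int nr_waiting;  /* normal IO requests blocked on the barrier (assumed name) */
};

static void model_init(struct md_barrier_model *m)
{
    pthread_mutex_init(&m->lock, NULL);
    pthread_cond_init(&m->wake, NULL);
    m->barrier = 0;
    m->nr_pending = 0;
    m->nr_waiting = 0;
}

/* Normal IO: block while the barrier is up, then count as pending. */
static void wait_barrier(struct md_barrier_model *m)
{
    pthread_mutex_lock(&m->lock);
    if (m->barrier > 0) {
        m->nr_waiting++;
        while (m->barrier > 0)
            pthread_cond_wait(&m->wake, &m->lock);
        m->nr_waiting--;
    }
    m->nr_pending++;
    pthread_mutex_unlock(&m->lock);
}

/* Normal IO finished: a raiser may now be able to proceed. */
static void allow_barrier(struct md_barrier_model *m)
{
    pthread_mutex_lock(&m->lock);
    assert(m->nr_pending > 0);
    m->nr_pending--;
    pthread_cond_broadcast(&m->wake);
    pthread_mutex_unlock(&m->lock);
}

/* Background IO: raise only when no normal IO is pending or waiting. */
static void raise_barrier(struct md_barrier_model *m)
{
    pthread_mutex_lock(&m->lock);
    while (m->nr_waiting > 0 || m->nr_pending > 0)
        pthread_cond_wait(&m->wake, &m->lock);
    m->barrier++;
    pthread_mutex_unlock(&m->lock);
}

/* Background IO finished: wake normal IO once the count may reach zero. */
static void lower_barrier(struct md_barrier_model *m)
{
    pthread_mutex_lock(&m->lock);
    assert(m->barrier > 0);
    m->barrier--;
    pthread_cond_broadcast(&m->wake);
    pthread_mutex_unlock(&m->lock);
}
```

In this model a normal IO path would bracket its work with
wait_barrier()/allow_barrier(), and a resync-like path with
raise_barrier()/lower_barrier().  Because raise_barrier() refuses to
proceed while nr_waiting is non-zero, a waiting IO request gets its chance
before another barrier goes up, which is the policy the comment describes.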

Ian.


