Re: CAS (lws_compare_and_swap32)



2010/3/29 NIIBE Yutaka <gniibe@fsij.org>:
> I am currently investigating FTBFS of gauche (a scheme interpreter) on
> hppa.  My knowledge of hppa is quite limited, though.  I am not on
> this list.  Please send Cc: to me.
>
> I have a question of CAS implementation.  I assume uni processor
> system.
>
> I am looking at:
>        linux-source-2.6.30/arch/parisc/kernel/syscall.S
>
> ----------------
>        /*
>                prev = *addr;
>                if ( prev == old )
>                  *addr = new;
>                return prev;
>        */
> [...]
> cas_action:
> [...]
>        /* The load and store could fail */
> 1:      ldw     0(%sr3,%r26), %r28
>        sub,<>  %r28, %r25, %r0
> 2:      stw     %r24, 0(%sr3,%r26)
> ----------------
>
> Suppose that <addr> points to copy-on-write memory.  At the label 2,
> storing data to <addr> will invoke memory trap and it will go to
> do_page_fault() to get new memory.  In this scenario, is there a
> possibility for the process to be scheduled off?
>
> Call chain in question is:
>        do_page_fault()
>          ->..-> do_wp_page()
>            ->..-> __alloc_pages_internal() with GFP_HIGHUSER_MOVABLE
>              ->..> schedule()
>
> linux/gfp.h has the definition:
> #define GFP_HIGHUSER_MOVABLE    (__GFP_WAIT | __GFP_IO | __GFP_FS | \
>                                 __GFP_HARDWALL | __GFP_HIGHMEM | \
>                                 __GFP_MOVABLE)

I wrote the LWS CAS implementation.

When I wrote it, I tried to verify that a process executing the CAS
could never sleep, since sleeping between the load and the store would
make the operation non-atomic.

There are checks in entry.S to prevent return code-paths from
scheduling or delivering signals if the process was executing on the
gateway page.

If we are certain that the above can happen, then a possible solution is:
* Enable the locks for both SMP and UP.
* If the lock is taken for your address, return to userspace with EAGAIN.
* Userspace yields on EAGAIN and then tries again (we can't use
FUTEX_WAIT/FUTEX_WAKE on a global process-unique variable because LWS
CAS is expected to work on shmem).
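
A minimal sketch of the userspace side of that retry loop, under stated
assumptions: sim_lws_cas32() and sim_busy_attempts are hypothetical
stand-ins that only model the control flow of "lock taken, return
EAGAIN"; they are not the real gateway-page entry and are not atomic
themselves. cas32_retry() is likewise an illustrative name.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of upcoming calls that will find the lock taken. */
static int sim_busy_attempts = 0;

/*
 * Hypothetical stand-in for the kernel side of LWS CAS.
 * Returns -EAGAIN while the (simulated) lock is taken; otherwise performs
 *     prev = *addr; if (prev == old) *addr = new_val;
 * stores prev through *prev and returns 0.
 */
static int sim_lws_cas32(volatile uint32_t *addr, uint32_t old,
                         uint32_t new_val, uint32_t *prev)
{
    assert(addr != NULL && prev != NULL);
    if (sim_busy_attempts > 0) {
        sim_busy_attempts--;
        return -EAGAIN;
    }
    *prev = *addr;
    if (*prev == old)
        *addr = new_val;
    return 0;
}

/*
 * Userspace wrapper: yield the CPU and try again for as long as the
 * kernel reports EAGAIN.  Returns the value previously stored at addr.
 */
static uint32_t cas32_retry(volatile uint32_t *addr, uint32_t old,
                            uint32_t new_val)
{
    uint32_t prev;

    while (sim_lws_cas32(addr, old, new_val, &prev) == -EAGAIN)
        sched_yield();
    return prev;
}
```

The caller compares the returned value with old to learn whether the
swap happened. Yielding instead of sleeping on a futex keeps the wrapper
independent of any per-process state, which matters if the word lives in
shmem.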

Do we really think the above can happen?

Cheers,
Carlos.

