[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Hurd stacks, some more (was: RFC: ruby1.9.1 FTBFS)



Hi!

I'm looking at Debian's ruby1.9.1-1.9.3.194 sources.  Please tell if
you're addressing different sources.


I notice that USE_SIGALTSTACK seems to be defined for us (please
check/confirm).  Are there any visible functionality issues due to that?
(I would assume so.)  If there are no such tests that trigger usage of
the relevant code paths, can you create one?


On Wed, 3 Jul 2013 19:53:35 +0200, Gabriele Giacone <1o5g4r8o@gmail.com> wrote:
> attached patch fixes ruby tests (10 of 941).
> 9 are thread-related and fixed by setting stack size and first stack
> address to PTHREAD_STACK_DEFAULT. More info about threadvars at [1].

> [1] http://darnassus.sceen.net/~hurd-web/open_issues/glibc/ - search
> for makecontext

> --- a/cont.c
> +++ b/cont.c
> @@ -47,8 +47,12 @@
>  #define RB_PAGE_SIZE (pagesize)
>  #define RB_PAGE_MASK (~(RB_PAGE_SIZE - 1))
>  static long pagesize;
> +#ifdef __GNU__
> +#define FIBER_MACHINE_STACK_ALLOCATION_SIZE  (0x200000)
> +#else
>  #define FIBER_MACHINE_STACK_ALLOCATION_SIZE  (0x10000)

IRC, OFTC, #debian-hurd, 2013-06-23:

    <gg0>   #=> killed by SIGABRT (signal 6)
    <gg0> | *** makecontext: a stack at 0xb7000 with size 0x10000 is not usable
      with threadvars
    <gg0> 7/10 ruby failed tests
    <gg0> introduced by eglibc 2.17-6, tschwinge patch?
    <youpi> possibly, yes
    <tschwinge> gg0: Yeah, now that *context functions are available, the Ruby
      (?) code needs to be taught with are the requirements/restrictions
      currently for Hurd systems (stack size).

The fate of an incomplete implementation...  We now get to patch various
upstream packages, because they detect we now got the *context functions
implemented (but don't and can't be expected to detect that our
implementation is a bit picky)?  Hmm...

Definitely we should keep a list of such patches, so they can later be
reverted, once we got a more compliant implementation of the *context
functions.  (And likewise for sigaltstack.)

> @@ -561,7 +565,11 @@ fiber_machine_stack_alloc(size_t size)
>  	void *page;
>  	STACK_GROW_DIR_DETECTION;
>  
> +#ifdef __GNU__
> +	ptr = mmap(size, size, PROT_READ | PROT_WRITE, FIBER_STACK_FLAGS, -1, 0);
> +#else
>  	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, FIBER_STACK_FLAGS, -1, 0);
> +#endif
>  	if (ptr == MAP_FAILED) {
>  	    rb_raise(rb_eFiberError, "can't alloc machine stack to fiber");

I wonder how/why that works.  :-)

What you need is a region of memory to be used for the stack that is
aligned per the size variable -- but not necessarily mapped to the
address as indicated by the size variable (though, that happens to have
the proper alignment, too).  As it happens, there is a high chance that
this address 0x200000 is indeed available for mmap in a GNU Hurd system,
but this is not a proper solution.  I'd rather have our mmap
implementation in glibc do the proper thing when the MAP_STACK flag is
set (the Ruby code does set it), and continue passing NULL for the
request address.  I wonder: if MAP_STACK is set, would it even be
reasonable for mmap to ignore the supplied length, and instead use the
one "proper" value, 0x200000?


> Remaining one tries to kill -USR1 a process and expects it killed
> after 1 sec. Patch increases wait time from 1 to 3, hurd is slow.

> --- a/bootstraptest/test_fork.rb
> +++ b/bootstraptest/test_fork.rb
> @@ -30,7 +30,7 @@ End
>  assert_equal 'ok', %q{
>    begin
>      if pid1 = fork
> -      sleep 1
> +      sleep 3
>        Process.kill("USR1", pid1)
>        _, s = Process.wait2(pid1)
>        s.success? ? :ok : :ng

Using the sleep function for synchronizing parallel processes is not
exactly state of the art (but that is not your fault, of course) -- so
yeah, if that does fix the issue, why not propose this patch.


Grüße,
 Thomas

Attachment: pgpMVjTP4j2rj.pgp
Description: PGP signature


Reply to: