
Re: Please give back ruby1.9/1.9.0.2-9 on hppa and alpha



On Thu, Feb 05, 2009 at 04:51:15PM -0700, dann frazier wrote:
> On Thu, Feb 05, 2009 at 09:00:43PM +0100, Helge Deller wrote:
> > dann frazier wrote:
> > > On Mon, Feb 02, 2009 at 07:04:48PM +0100, Lucas Nussbaum wrote:
> > >> ruby1.9 still fails to build on hppa and alpha.
> > >>
> > >> On hppa, it's caused by a kernel bug, which was partially fixed (at
> > >> least the kernel doesn't panic() anymore). Since the issue is related to
> > >> threading, it is possible that retrying could make it build
> > >> successfully.
> > > 
> > > fyi, I've retried it numerous times on both buildds with no
> > > luck. We're not crashing the buildd anymore - thanks to Helge's fix -
> > 
> > The kudos belong to James Bottomley btw. I did debugging and testing,
> > but James gave me the final hint to the solution then...
> > 
> > > but the build hangs indefinitely. I've no objection to it being
> > > retried again of course (and I'm not the buildd admin anyway) - I just
> > > want to set your expectations.
> > 
> > I tried a few times now to find the bug. I'm not sure whether it's really due to:
> > a) a kernel bug (probably)
> > b) the fact that hppa still uses Linuxthreads (although Dann mentioned
> > in another mail that he saw similar problems with another server which
> > used NPTL instead of Linuxthreads)
> 
> Since I don't remember the last time I tried, I've started another
> build in my NPTL chroot running a fixed kernel to verify that I'm
> still seeing it.

And I am. Hangs at:

cc -fno-strict-aliasing -g -g -O2 -O2 -g -Wall -Wno-parentheses  -fPIC  -I. -I.ext/include/hppa-linux -I./include -I.  -DRUBY_EXPORT   -o dmyext.o -c dmyext.c
cc -fno-strict-aliasing -g -g -O2 -O2 -g -Wall -Wno-parentheses  -fPIC  -L.  -rdynamic -Wl,-export-dynamic   main.o dln.o dmyencoding.o miniprelude.o array.o bignum.o class.o compar.o complex.o dir.o enum.o enumerator.o error.o eval.o load.o proc.o file.o gc.o hash.o inits.o io.o marshal.o math.o numeric.o object.o pack.o parse.o process.o prec.o random.o range.o rational.o re.o regcomp.o regenc.o regerror.o regexec.o regparse.o regsyntax.o ruby.o signal.o sprintf.o st.o string.o struct.o time.o transcode.o util.o variable.o version.o blockinlining.o compile.o debug.o iseq.o vm.o vm_dump.o thread.o cont.o id.o ascii.o us_ascii.o unicode.o utf_8.o strlcpy.o strlcat.o  dmyext.o  -lpthread -lrt -ldl -lcrypt -lm   -o miniruby
./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb  ./enc/make_encdb.rb ./enc encdb.h.new


> 
> > c) wrong pthread coding in ruby1.9
> > 
> > If it's due to a) (kernel bug), then it's hard to find and track down.
> > I concentrated on b) and c) for now. LT uses a few signals to synchronize the
> > threads, and ruby plays some small but bad games with signals in its code, e.g.
> > rb_disable_interrupt() and rb_enable_interrupt() in signal.c.
> > With the attached patch/hack below I tried to work around possible LT-related corner cases
> > in ruby1.9, but the issue stays the same: "make test" will make the ruby
> > testsuite hang in the "test_thread.rb" test. It seems some thread is waiting
> > for a signal which will not arrive, since the other thread is a zombie already....
> > 
> > Anyway, it would be nice if someone with ruby knowledge could reduce 
> > the testsuite, so that it will be easier to reproduce the bug. I'm a little
> > lost at this stage. Now that the hppa kernel no longer crashes, creating
> > such a testcase should be much easier.
> 
> 
> 
> 

-- 
dann frazier

