
Re: Seg 11 in GCC



The manifestation of memory problems can easily be a function of usage
patterns.  If you change your usage, then a trouble-free process may
show error messages, or a problematic process may suddenly work.  That
does not remove the possibility of a memory problem.

Note that a bad SIMM may only show up as bad if a particular pattern
happens to land in certain of its bits.  Also note that there are
some interactions between a machine's RAM and its virtual memory
system that look non-deterministic from the surface.  If a page is in
RAM during one compile and in swap during another, the error could
conceivably show up in the first but not the second.
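
To make the pattern-dependence concrete, here is a toy sketch in Python.
It is a simulation, not a real memory tester: the FlakyMemory class, the
faulty address (17) and the stuck bit (3) are all made up for
illustration.  It models a single bit stuck at 0 and shows that some fill
patterns read back clean while others expose the fault.

```python
class FlakyMemory:
    """Toy model of RAM with one bit stuck at 0 (hypothetical fault)."""

    def __init__(self, size, bad_addr, bad_bit):
        self.cells = bytearray(size)
        self.bad_addr = bad_addr
        self.bad_mask = 1 << bad_bit

    def write(self, addr, value):
        if addr == self.bad_addr:
            # The stuck bit can never hold a 1, whatever was written.
            value &= ~self.bad_mask & 0xFF
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def failing_addresses(mem, size, pattern):
    """Fill memory with one byte pattern; return addresses that read back wrong."""
    for addr in range(size):
        mem.write(addr, pattern)
    return [addr for addr in range(size) if mem.read(addr) != pattern]


SIZE = 64
mem = FlakyMemory(SIZE, bad_addr=17, bad_bit=3)  # assumed fault location
results = {p: failing_addresses(mem, SIZE, p) for p in (0x00, 0x55, 0xAA, 0xFF)}

for pattern, bad in results.items():
    status = "FAIL at " + str(bad) if bad else "ok"
    print(f"pattern 0x{pattern:02X}: {status}")
```

Patterns 0x00 and 0x55 leave bit 3 clear, so the faulty cell passes; 0xAA
and 0xFF set it, so the same cell fails.  A workload that never happens to
store a 1 in that bit would run trouble-free on the same hardware.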

That's not to say that the signal 11 problem is definitely hardware -
I've seen no argument compelling enough to make me conclude 100% either
way, hardware or software.  Bad RAM does seem to be the stronger
hypothesis, IMO, tho.

Michael Alan Dorman wrote:
> 
> In message <m0u2NcT-00063bC@mongo.pixar.com>, Bruce Perens writes:
> >From: Glenn Bily <gb2187@wcuvax1.wcu.edu>
> >> I have a hard time believing that RAM goes bad as much as you guys/gals
> >> claim. Nor do I really believe this would happen with modern machines.
> >I've seen these failures in my own system and have diagnosed them as being
> >RAM related. They went away when I changed the RAM.
> 
> With all due respect to the many (many!) people here who know more
> about gcc/linux/Unix/whatever than I, I feel obligated to say that
> while occasional SIG11 problems may have been RAM related (and I do
> have some experience with memory problems---I've upgraded Atari STs by
> hand-soldering DIPs piggy-back on existing memory.  Not somewhere you
> want a cold-solder joint), the current trend of automatically
> classifying every single SIG11 as indicative of bad memory is simply
> hogwash.  I base my statement on my experiences of 3/28 (yesterday).
> 
> I've got a brand new P150 here next to me, with 64MB RAM, a DPT F/W
> SCSI-2 controller all running off a Fujitsu Fast (not Wide) SCSI-2
> disk.  I installed the Debian base system, and just enough to compile
> a kernel, and then I tried to recompile.  I got SIG11s when trying
> to compile conmakehash.c (during make dep).  I reinstalled gcc,
> conmakehash compiled fine, but I got SIG11s and vm errors.  I
> increased my swap space from 8MB (top claimed it wasn't swapping, why
> would it need more), and I was down to just SIG11s.
> 
> Then I compiled a 1.3.80 kernel on another machine, reinstalled Debian
> using 1.3.80, and can now do 'make -j zImage' (and there are few
> things more amusing than watching top while this is happening, as your
> entire screen will just _fill_ with gcc processes).  I have done this
> upwards of 20 times (11 minute kernel compiles are fun, too) in the
> last 24 hours.  Never seen a SIG11.
> 
> So, I've not changed the hardware, and I'm exercising it more than I
> was previously (keeping the load above 6, mostly), and yet I see no
> SIG11s, even during the parallel compilations.  That would tend to
> cast significant doubt on the common assertion that SIG11 = hardware
> problem, no?
> 
> Mike.
> --
> "Don't let me make you unhappy by failing to be contrary enough...."


