
My Alpha is still unstable



	Hi!

First of all, thanks to all of you who responded so quickly.
As suggested by Stig Telfer, I got the patched kernel from
gatekeeper.dec.com, compiled it and installed. I also got
the patched SVGA XServer. It seems to have helped somewhat.
I don't get 'Oops'es anymore, and right after the machine
comes up it seems to work OK, X and all. But it didn't
survive the night. There seems to be a serious memory leak somewhere. I
cannot log in from remote machines anymore. I
get an 'out of memory for ...' on the console every time I
try. Yesterday I deliberately left a session running on the
console (i.e. I didn't log out), so I was able to run top.
It said that I had only about 6 of 256 Megs of memory free
and there were no memory-hungry processes running. After
that I tried to 'shutdown -r', but there apparently wasn't
enough memory for that. I also tried Ctrl-Alt-Del, and init
ran out of memory. So I'm contemplating the off switch again.
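
One way to tell a real leak from memory that is merely held by buffers
and page cache would be to look at /proc/meminfo next to top. The
following is only a minimal sketch: it assumes the file contains
'Name:  1234 kB' lines (I have not checked that every 2.0.x kernel
prints them), and the helper names parse_meminfo and unaccounted_kb
are made up for illustration.

```python
import re

# Assumption: fields appear as "MemTotal:   262144 kB". Any other line
# (e.g. an older byte-count table) is simply skipped.
KB_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")

def parse_meminfo(text):
    """Return {field: kilobytes} for every 'Name:  1234 kB' line in text."""
    fields = {}
    for line in text.splitlines():
        m = KB_LINE.match(line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def unaccounted_kb(fields):
    """Memory that is neither free nor buffer/page cache, in kB."""
    reclaimable = sum(fields.get(k, 0) for k in ("MemFree", "Buffers", "Cached"))
    return fields["MemTotal"] - reclaimable
```

Calling unaccounted_kb(parse_meminfo(open("/proc/meminfo").read()))
every few minutes and comparing the result with the total resident
size that top reports for processes should show whether the missing
memory is growing inside the kernel rather than in user space.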

Here is some more background: This machine worked fine and 
was quite stable before I got the task of finding out the
Processor ID or Motherboard ID or something like that. I was
told I needed the SRM console, so I replaced the ARC console
with it. That is when the machine started acting strange (I
think -- and by the way I still don't know how to get the
Processor ID). But at the same time I was constantly upgrading
Debian, so I wasn't sure about what went wrong.

Here is some hard data about the system:
The platform is AlphaPC164LX 533MHz, 256M RAM
Linux version 2.0.35 (root@cruncher) (gcc version egcs-2.91.57 19980901 (egcs-1.1 release)) #4 Thu Nov 5 17:52:45 CET 1998

To answer Loic, I don't know if I used the -mno-fp-regs option for
compiling the modules. There is no such flag in
the toplevel Makefile (which has been patched with the patch
#2 from gatekeeper). Am I supposed to add it to MODFLAGS 
or something?

Now, what else can I look at? Could one of the daemons cause
trouble? The system is running NIS (client), autofs, lpd,
sshd, ... Below I'm adding snips of the 'messages' log file.
There is much more strangeness in the logs, but this post is 
too long already.

	thanks,

	feri.





The log files show '-- MARK --' until cron started checkerr:

Nov  6 06:38:06 cruncher -- MARK --
Nov  6 06:43:24 cruncher logger: Cron job - running checkerr as mail 
Nov  6 06:43:24 cruncher kernel:  
Nov  6 06:43:24 cruncher kernel: Out of memory for checkerr. 
Nov  6 06:43:24 cruncher kernel:  
Nov  6 06:43:24 cruncher kernel: Out of memory for checkerr. 
Nov  6 06:43:24 cruncher kernel:  

After this there are more messages saying 'Out of memory ...'
for various services, until later on, when I tried to log in
remotely:

Nov  6 09:55:55 cruncher kernel: Out of memory for login. 
Nov  6 10:05:13 cruncher kernel: bash: memory violation at pc=00000000 rp=15555a69680 (bad address = 00000000) 
Nov  6 10:05:13 cruncher kernel: bash: memory violation at pc=00000000 rp=120057324 (bad address = 00000000) 
Nov  6 10:05:13 cruncher last message repeated 1402 times
Nov  6 10:05:13 cruncher kernel: bash: memory violation at pc=00000000 rp=12003f304 (bad address = 00000000) 
Nov  6 10:05:13 cruncher kernel: bash: memory violation at pc=00000000 rp=120057324 (bad address = 00000000) 
Nov  6 10:05:14 cruncher last message repeated 5906 times
Nov  6 10:05:15 cruncher kernel: n at pc=00000000 rp=120057324 (bad address = 00000000) 
Nov  6 10:05:15 cruncher kernel: bash: memory violation at pc=00000000 rp=120057324 (bad address = 00000000) 
Nov  6 10:05:15 cruncher last message repeated 168 times
Nov  6 10:05:15 cruncher kernel: kfree of non-kmalloced memory: fffffc000f1c6f28, next= fffffc000f2d2000, order=1 
Nov  6 10:05:15 cruncher kernel: kfree of non-kmalloced memory: fffffc000f5322a8, next= fffffc000f1c6000, order=1
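
Since the full log is too long to post, a short script could summarize
which programs were refused memory and how often. This is a rough
sketch, assuming the kernel messages always look like the 'Out of
memory for checkerr.' lines above. The function name count_oom is
hypothetical, and syslog's 'last message repeated N times' lines are
not expanded, so the counts are lower bounds.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Nov  6 06:43:24 cruncher kernel: Out of memory for checkerr.
# Assumption: the program name has no spaces and is followed by a period.
OOM_RE = re.compile(r"kernel: Out of memory for (\S+?)\.?\s*$")

def count_oom(lines):
    """Return a Counter mapping program name -> number of OOM messages."""
    counts = Counter()
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Running count_oom(open("/var/log/messages")).most_common() (the log
path may differ on other setups) would list the affected programs,
most frequent first.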

