
Re: computer hangs, how do I go about diagnosing?



<quote who="Paul Reavis">

> It is a sid install, on newish hardware (athlon 1.4G) with a 2.4.14
> kernel. I updated to latest sid yesterday.

Newish as in brand new? Has the system run anything
else without problems before these hangs started?


> Could someone clue me in on this, or at least point me towards
> other troubleshooting/diagnostic tools I could try? Thanks.

While I'm by no means a kernel hacker, what I see in the logs
looks like memory faults. If that's the case, then ...

- None of the components in the system are overclocked?
- Feel the sides of the case. Is it warm? If it's warm to
the touch, it's usually running too hot.
- Try a kernel stress test while not running X:
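-- little script--
#!/bin/bash
# Rebuild the kernel in a loop; every pass is logged with a timestamp.
cd /usr/src/linux || exit 1
while true
do
    # Only log success if every build step actually succeeded.
    if make clean && make dep && make -j bzImage
    then
        echo "kernel compiled on $(date)" >>/root/kernel-loop.log
    else
        echo "kernel build FAILED on $(date)" >>/root/kernel-loop.log
    fi
done
--end script--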

I'm doing that on a FreeBSD machine at this moment, actually.
I had some problems with CRC errors, had the vendor "fix"
the system, and compiled 1500 kernels on it over a period
of a few days. I have another system that I'm testing now.
Note that you need gobs of memory to do a make -j with no job
limit. I did it on Red Hat 7 (this FreeBSD machine was Red Hat 7
when I ordered it), and the number of running processes spiked
to about 220, memory usage went way up, and load went to about 40.
It is a P3 1GHz with 512MB RAM. If you have insufficient resources,
you can try make -j 5 or make -j 2.
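
If you'd rather have the loop stop at the first failed build (a random compile failure is the interesting signal here), something like the following might work. It is only a sketch: the function name run_passes, the pass count, and the ".last" output file are my own inventions, not anything standard.

```shell
#!/bin/bash
# run_passes MAX_PASSES LOGFILE COMMAND [ARGS...]
# Runs COMMAND up to MAX_PASSES times and stops at the first failure.
# Output of the most recent pass is kept in LOGFILE.last (hypothetical name).
run_passes() {
    max=$1
    log=$2
    shift 2
    pass=1
    while [ "$pass" -le "$max" ]; do
        if "$@" >"$log.last" 2>&1; then
            echo "pass $pass ok on $(date)" >>"$log"
        else
            echo "pass $pass FAILED on $(date)" >>"$log"
            return 1
        fi
        pass=$((pass + 1))
    done
    return 0
}

# Example (paths and job count are assumptions, adjust to taste):
#   cd /usr/src/linux && run_passes 100 /root/kernel-loop.log \
#       sh -c 'make clean && make dep && make -j 2 bzImage'
```

After a failure, LOGFILE.last should hold the compiler output from the pass that broke, which is usually what you want to look at first.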

Have you tried any other kernels? If your hardware permits,
I can't help but suggest trying kernel 2.2 to see if
it has the same problem (there were a lot of bugs like the
one you're experiencing in past 2.4.x kernels, though
I haven't seen many mentioned for the recent 2.4.10 and
newer). All of my Linux systems run 2.2; I consider 2.4
too unstable for my use for another year.

If you have enough RAM, try turning off swap and see
if the problem persists. I wouldn't recommend running X with
no swap unless you have at least 256MB RAM; crashing
due to running out of memory wouldn't make a good test :)
Maybe there are bad blocks on the disk in the swap
partition causing problems ..
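
A rough sketch of checking whether you have enough RAM before turning swap off, using the 256MB rule of thumb above. The function names are made up for illustration, and the device name in the comments is a placeholder you would need to replace with your real swap partition.

```shell
#!/bin/bash
# Print total RAM in MB from a meminfo-style file (default: /proc/meminfo).
ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeeds if total RAM is at least 256MB (the rule of thumb from this mail).
enough_ram_for_no_swap() {
    [ "$(ram_mb "$1")" -ge 256 ]
}

# As root, the actual steps would be roughly:
#   swapoff -a                # disable all swap areas
#   cat /proc/swaps           # confirm that nothing is listed any more
#   badblocks -sv /dev/XXX    # read-only scan; /dev/XXX is a placeholder
#   swapon -a                 # turn swap back on afterwards
```

Run badblocks only while swap is off, so the partition isn't in use during the scan.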

What speed RAM do you have? My desktop at work used
to lock up randomly too. The vendor swore that the
RAM was 133MHz when in fact it was about 127MHz (according
to a coworker). I used the BIOS (Asus CUV4X) to force
the RAM to 100MHz, and I have not had a crash in over
a year. In fact, the machine currently has an uptime of
209 days.

If all that fails to turn up anything, the only other
thing I can recommend is to reconfigure the system with
a minimum amount of experimental software. That may
mean taking out hardware that has beta/alpha drivers
and replacing it with hardware known to be solid (video
cards especially, since you're using X). Try a 2.2 kernel
on Debian potato and see if the problem persists.

Another decent burn-in test is seti@home. On new
systems I used to run 10 copies of it at the same
time. It would cause heavy disk access for swap (especially
if the system only had 128MB RAM), and memory
and CPU usage would go through the roof. The console
would be almost unusable because the system was so
slow, but if the machines emerged without a problem they
were certified OK by me.
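
Launching several copies of a load generator at once can be sketched like this. The function name run_copies is made up, and the command in the usage comment is a stand-in, since how you start your seti@home client depends on your install.

```shell
#!/bin/bash
# run_copies N COMMAND [ARGS...]
# Starts N copies of COMMAND in the background, waits for all of them,
# and prints how many exited with a non-zero status.
run_copies() {
    n=$1
    shift
    pids=""
    failed=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        pids="$pids $!"
        i=$((i + 1))
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

# Example (the client path is a placeholder, not a real location):
#   run_copies 10 /path/to/your/load-generator
```

A copy that dies with a segfault on an otherwise idle box counts as a failure here, and that is exactly the kind of thing a burn-in is meant to surface.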

memtest86 is good too; you mentioned you ran that
already. How long did you run it for? In my
testing it took about 12 hours to test 768MB
of RAM on a 1.3GHz Athlon. When I had motherboard problems
earlier in the year (I recommend against using
the Asus A7A266), I thought it could be the RAM,
so I ran memtest86 for about 5 days straight
without a single error. It made about 12 passes
over the memory during that time.

If you're still having problems, include a detailed
report on the system configuration: hardware,
software, driver versions, BIOS settings, disk
partition settings, etc.
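
Most of that report can be gathered with a small script. This is just a sketch: the function names and the choice of commands are my own, some tools may not be installed on your box, and BIOS settings still have to be written down by hand.

```shell
#!/bin/bash
# Print a header, then the output of the given command (or a note if the
# command is missing or fails), so the report stays readable either way.
section() {
    echo "== $* =="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "($1 failed)"
    else
        echo "($1 not available)"
    fi
    echo
}

# Collect kernel, CPU, memory, PCI, module, partition and swap information.
collect_report() {
    section uname -a
    section cat /proc/cpuinfo
    section cat /proc/meminfo
    section lspci
    section lsmod
    section cat /proc/partitions
    section cat /proc/swaps
}

# Example (output file name is arbitrary):
#   collect_report > system-report.txt
```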

hth

nate




