
Re: Diagnosing faulty hardware



James Foster wrote:

I believe this is most likely a hardware problem.
Generally, the system is capable of staying up, although it has locked
completely once or twice.


You can "apt-get install memtest" to test your RAM. (It'll install a stanza in /etc/lilo.conf, and then you just boot into memtest instead of into Linux.)

You can also take a stick of RAM out (assuming you have multiple sticks) and run your machine for a while to see if the lockups go away. Then put that stick back, pull a different one, and repeat until you've tested each stick.

Open the box and check that the fans are spinning. Install lm-sensors to monitor your temperatures. Look in your BIOS for a setting to lower the CPU clock speed, and run that way for a few days to see if the problem goes away.
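
If you'd rather log temperatures from a script, here is a rough sketch. It does not use lm-sensors at all; it assumes a Linux kernel that exposes thermal zones under /sys/class/thermal, with each zone's "temp" file holding millidegrees Celsius. The function name read_temps is just made up for this example, and some machines expose no zones, in which case you get an empty result.

```python
import glob
import os


def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    temps = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "temp")) as f:
                # Assumption: the kernel reports millidegrees Celsius here.
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # zone missing or unreadable; skip it
        temps[os.path.basename(zone)] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    for name, celsius in read_temps().items():
        print(f"{name}: {celsius:.1f} C")
```

Run it from cron every minute or so and append the output to a file; if the readings climb just before a lockup, heat is a likely suspect.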

Don't forget power issues; get a cheapie UPS ($40) to make sure that brownouts aren't killing you. (Have you noticed your lights dimming briefly when the fridge kicks into gear, etc.?) Do you have an AC-generating device near your CPU, such as a fluorescent light or a sonic bug-chaser? (To check for environmental issues like this, you can physically move the computer to another room for a week and see if the problem goes away.)

Are you running out of space on one of your partitions, such as /tmp or swap?
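
A quick way to check this from a script, as a sketch only: the helper names usage_percent and swap_percent are made up for this example, and the swap figure assumes a Linux /proc/meminfo with SwapTotal and SwapFree lines. The percentage may differ a little from what df prints, since df accounts for reserved blocks.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding path is, as a percentage."""
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total if total else 0.0


def swap_percent(meminfo="/proc/meminfo"):
    """Return the percentage of swap in use, or None if it cannot be determined."""
    fields = {}
    try:
        with open(meminfo) as f:
            for line in f:
                key, _, rest = line.partition(":")
                parts = rest.split()
                if parts:
                    fields[key] = int(parts[0])  # values are in kB
    except (OSError, ValueError):
        return None
    total = fields.get("SwapTotal", 0)
    if not total:
        return None  # no swap configured, or the field is missing
    return 100.0 * (total - fields.get("SwapFree", 0)) / total


if __name__ == "__main__":
    for path in ("/", "/tmp"):
        print(f"{path}: {usage_percent(path):.1f}% full")
    swap = swap_percent()
    print("swap:", "unavailable" if swap is None else f"{swap:.1f}% used")
```

If either number sits near 100% around the time of a lockup, that points at your software setup rather than the hardware.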

You might try running your machine off a Knoppix CD for a few days; if the problem still exists, you know it's either hardware or environmental; if the problem goes away, you know it's in your software setup.

--
Kent


