Hi folks,

I've got a bit of a problem with one of my servers and I was wondering if anyone could offer a suggestion.

I've got a slink system with a 2.2 kernel, and I'm getting lots of page faults, kernel errors, and so on. My kern.log file is full of errors, sometimes happening every couple of minutes. Very rarely does it actually bring down the server, but I'm sort of worried about random things dying all of the time! I also can't compile anything: most of the time I get seemingly random errors (different each time) in ./configure scripts.

My first thought was that it was the RAM. I loaded up memtest86 through lilo, left it running for quite some time, and didn't get even the hint of an error.

My suspicion now is that maybe it is the swap partition. We had a bit of a crash on this drive a while ago (turned out to be a dodgy motherboard) which did some nasty things to the file system (which I recovered). I'm thinking that maybe some bad blocks have appeared in the swap space, and when the kernel tries to access them, it has an aneurysm. Does that sound like a plausible theory?

Acting on that hunch, I tried disabling the swap, booting to single user mode, and deleting, then re-creating, the swap partition (using fdisk, then mkswap). This doesn't seem to have changed anything dramatically.

Should I try running badblocks over the swap partition (can you even do that)?

Any suggestions would be greatly appreciated.

cheers,
damon

--
Damon Muller (dm-sig6@empire.net.au)  /  It's not a sense of humor.
* Criminologist                        /  It's a sense of irony
* Webmeister                           /  disguised as one.
* Linux Geek                           /     - Bruce Sterling