
Re: etch on aranym, was Re: [buildd] Etch?



Finn Thain wrote:
> > > difficult to reproduce the bug?
> > It's kinda random.
>
> In that case, it might be necessary to make the scheduler behave in a more deterministic way (maybe realtime priority?). Single-user mode would help.

I could try upgrading the sarge installation to etch in single-user mode to see if it changes anything.
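
For the realtime-priority idea, one option (assuming util-linux's chrt is available in the guest) would be to run the test under SCHED_FIFO, for example:

# SCHED_FIFO at priority 50: the task is only preempted by higher-priority
# realtime tasks, which makes scheduling much more deterministic
chrt -f 50 stress --cpu 1 --timeout 600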

> I'd create a script, say /root/crash.sh, make it executable, and boot the kernel with "init=/root/crash.sh". In crash.sh I'd run some single-threaded stress tests.
>
> http://samba.org/ftp/tridge/dbench/README
> http://weather.ou.edu/~apw/projects/stress/
> http://www.bitmover.com/lmbench/
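
A minimal sketch of such a crash.sh, assuming stress and dbench are installed; the tests and arguments are only placeholders:

#!/bin/sh
# /root/crash.sh - runs as init, so set up the basics first
mount -t proc proc /proc
mount -o remount,rw /
# single-threaded stress tests, one at a time
stress --cpu 1 --timeout 3600
dbench 1    # one client; dbench needs its client.txt loadfile installed
# drop to a shell when the tests finish
exec /bin/sh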

FYI, I have just finished the following test:
# stress -c 4 -i 16 -m 3 --vm-bytes 32M -d 4 --hdd-bytes 128M

It's been running for almost 5 hours with no problem detected. On another console I ran "while true; do uptime; sleep 300; done" and saw a consistent load of 28-29. So the machine was busy stressing CPU, memory and disk, but nothing went wrong.

> If you can't reproduce the problem that way, I'd try introducing more context switching into the workload.

like stress -c 1k instead of -c 4?

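Or something along these lines: many short-lived processes talking through pipes (the process count and sizes below are arbitrary):

#!/bin/sh
# force lots of context switches: each pipeline bounces the scheduler
# between producer and consumer; 100 concurrent pipelines
i=0
while [ $i -lt 100 ]; do
    ( dd if=/dev/zero bs=512 count=10000 2>/dev/null | cat >/dev/null ) &
    i=$((i + 1))
done
wait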

> Are you sure the problem was not confined to the buffer cache?

I am not sure at all.

> Re-reading the same file after an unmount/remount would determine that.

Will try that next time.
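
For reference, that check could look like this (the device name and mount point are only examples):

md5sum /mnt/data/testfile      # first read, possibly served from cache
umount /mnt/data               # unmounting drops the cached pages
mount /dev/hda2 /mnt/data      # device name is just an example
md5sum /mnt/data/testfile      # forced re-read from disk; sums should match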

Petr


