
Re: etch on aranym, was Re: [buildd] Etch?




On Wed, 16 Aug 2006, Petr Stehlik wrote:

> Finn Thain wrote:
> 
> > > In the following message (Feb 2006) I explain how weird it behaves: 
> > > https://lists.bobek.cz/pipermail/cz-bobek-lists-aranym/2006-February/008657.html
> > 
> > That thread didn't seem to generate much sympathy... the aranym devs 
> > probably need a workload that consistently triggers the bug? Is it 
> > difficult to reproduce the bug?
> 
> It's kinda random.

In that case, it might be necessary to make the scheduler behave in a more 
deterministic way (maybe realtime priority?). Single-user mode would help. 

I'd create a script, say /root/crash.sh, make it executable, and boot the 
kernel with "init=/root/crash.sh". In crash.sh I'd run some 
single-threaded stress tests.

http://samba.org/ftp/tridge/dbench/README
http://weather.ou.edu/~apw/projects/stress/
http://www.bitmover.com/lmbench/
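
If none of those tools is convenient to build on the image, a hand-rolled 
single-threaded workload might look roughly like the sketch below. It is 
only an illustration: the function names, block counts and sizes are made 
up, and Python is used for brevity (a shell or perl equivalent may be more 
practical to call from crash.sh). It writes a reproducible pattern to a 
file, reads it back, and reports which blocks differ. Note that the 
read-back will probably be served from the buffer cache unless the 
filesystem is remounted in between.

```python
import hashlib
import os


def pattern_block(index, size=4096):
    """Return a reproducible block of `size` bytes derived from `index`."""
    out = b""
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{index}:{counter}".encode()).digest()
        counter += 1
    return out[:size]


def write_pattern(path, blocks=256, size=4096):
    """Write `blocks` pattern blocks to `path` and push them to disk."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i, size))
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path, blocks=256, size=4096):
    """Re-read `path` and return the indices of blocks that do not match."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(size) != pattern_block(i, size):
                bad.append(i)
    return bad
```

Calling write_pattern() and verify_pattern() in a loop, and stopping as 
soon as verify_pattern() returns a non-empty list, would give a workload 
that fails in an obvious way when a byte gets changed.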

If you can reproduce the problem that way, I'd boot a real Atari with the 
same disk image & kernel, and make sure that it doesn't crash. Then I'd give 
the disk image to the aranym devs so that they could capture the 
instruction stream leading up to the crash and hopefully debug it.

If you can't reproduce the problem that way, I'd try introducing more 
context switching into the workload.

> I have just run the upgrade from sarge to etch and at one point the 
> installation failed. When I was inspecting why I found out that one file 
> deep in perl setup had the first character changed from '#' to 's' so it 
> looked as follows:
> 
> s!/usr/bin/perl
> 
> I edited that file using 'vi' and replaced the 's' with '#', saved and the
> installation continued normally.
> 
> So let's say that either CPU cannot unpack file properly (under some
> conditions, perhaps related to MMU) or that the disk can damage files when
> storing (again under some special conditions).

Are you sure the problem was not confined to the buffer cache? Re-reading 
the same file after an unmount/remount would determine that. Actually, 
that reminds me of a problem I had with a North Bridge chip that only 
tripwire was able to detect. Continuous tripwire tests could be a good 
workload to try.
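
Without tripwire, the same kind of check could be approximated with a few 
lines of script. The sketch below is only an assumption about how one might 
do it, not how tripwire works: the function names are hypothetical and 
SHA-1 is an arbitrary choice of digest. It records a digest for every 
regular file under a directory and lists the files whose digest differs 
between two snapshots.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-1 hex digest of the file at `path`."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()


def snapshot(root):
    """Map every regular (non-symlink) file under `root` to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                digests[path] = file_digest(path)
    return digests


def changed_files(baseline, current):
    """Return sorted paths that differ or exist in only one snapshot."""
    paths = set(baseline) | set(current)
    return sorted(p for p in paths if baseline.get(p) != current.get(p))
```

Taking one snapshot, unmounting and remounting the filesystem, and taking 
a second snapshot would show whether a changed byte is really on disk or 
was only in the buffer cache. Run in a loop, it doubles as a workload.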

> And it's completely random so it's triggered by some interrupt, most 
> probably. And it does not happen under other operating systems otherwise 
> users would already report that.
> 
> > In that thread you have a link to aranym-linux-kernel-2.4.27-12.tar.gz. Is
> > the latest kernel not working on aranym?
> 
> 2.4.27 was the 2.4.x kernel for Atari at the time when I was playing with that
> (September 2005). If there is a newer 2.4.x kernel then it should work.

From "should work", I take it that no out-of-tree patches are required... 
but which tree should work? Are you referring to mainline, linux-m68k or 
debian kernel?

-f

> 2.6.x kernels don't boot on Atari, AFAIK.
> 
> Petr
> 
> P.S. I CC:ed also aranym list since this is an aranym-specific issue. Further
> discussion should continue there, I think.
> 
> 
> 


