Re: open issues with the hppa port
On Thu, Sep 10, 2009 at 12:10:28PM -0400, Carlos O'Donell wrote:
> On Tue, Sep 8, 2009 at 11:53 PM, dann frazier <firstname.lastname@example.org> wrote:
> > We have been running with UP kernels for quite some time, and they
> > haven't proven to be any more stable. Most recently I've upgraded
> > peri/penalosa to 2.6.31-rc6-based kernels, since they include
> > the various changes I was pointed to on this list (thanks John/Helge).
> Where exactly did you get this kernel? Do you have a URL reference?
> > peri has been surprisingly stable - uptime of 2 weeks so far, and it
> > seems to be under pretty steady build load.
> That sounds great.
> > penalosa is a different story - it has been very unstable with uptimes
> > of several hours at most. The hardware/kernel packages are identical
> > to those of peri (afaict), so I'm not sure why. The failure mode
> > results in infinite panics being printed to the console - but every
> > time I've seen it I haven't had enough console history to see the
> > beginning of this crash. I am now logging the console to see if I can
> > capture that. It is of course possible that penalosa is having
> > hardware problems - but I don't know of a way to prove this
> > conclusively. We could maybe swap disks to see if the failure follows
> > the disks or the hardware (though that doesn't eliminate a disk
> > problem).
> The way to prove this is to put an instrumented kernel on penalosa.
> I think the way forward is:
> * You get me a console trace.
After about a week of uptime under load, penalosa became unstable
again over the weekend.
It began with an Illegal instruction leading to a panic:
I then rebooted it, and the console was hung immediately after boot:
Note the rc.local segfault there.
I rebooted it again, and it crashed after only a few minutes:
peri (identical hardware/kernel) has been up for about 20 days now.
> * I give you an instrumented kernel/initrd.
> * Repeat.
> Are you allowed to boot a kernel/initrd that I send you?
> > Note that I don't monitor the build output, so I don't know if we're
> > still seeing the same level of random segfaults in userland.
> > LaMont?