
Re: SGI O2: Clock problem and Bind9 problem after upgrading to Lenny





--- On Tue, 2/24/09, Sébastien Vajda <sebvajda@gmail.com> wrote:

> > Someone reported a similar problem on a Qube2 but we other people
> > couldn't really reproduce it.  Do you have any special workload, or
> > will the machine hang/crash when you just do normal tasks or leave it
> > idle?
> 
> Yep, that was me. I was never able to pinpoint the cause of the problem.
> When the problem happened the Qube2 had been installed with etch then
> dist-upgraded to sid (lenny at that time).
> 
> I investigated as best as I could but to no avail. As a last resort I
> reinstalled from scratch using a (beta) lenny installer and the
> problem just disappeared.
> 
> Maybe something is broken by the upgrade from etch to lenny?

I find the last part of your comment interesting.  Last spring my R5K-200 O2, tracking TESTING, went through a period of strange instability.  The machine had been rock solid for 6 months, then suffered a string of total lockups after I re-synced with TESTING.  The kernel package was not among those updated, as I was running my own custom build.  The lockups were truly at the kernel level: the box wouldn't even respond to ICMP over the network.  I even tried not running xorg, but it would still tank after a day or so.  I tried swapping RAM sticks, the power supply, etc. for spares, to no avail.  Finally, after about 6 weeks of that, I re-synced with TESTING again as I saw lots of updates had been pushed through, again keeping the same kernel.  The problem disappeared completely.

My one and only suspicion is that a significant GCC version update was being pushed through at the time I started experiencing the problem.

While this isn't supposed to happen, it almost seems that running a mixture of libraries and apps built with different versions of GCC has something to do with this issue.  Once the bulk of everything was built with one compiler version in that last update, the problem seemed to go away.

Coincidence or cause?  Don't know...  I find it interesting that the problem also went away when you did your own re-install.

Cheers,

-S-

