
Re: The SpeedStep crash problem ...



> > I did find one thread that indicates that ACPI support is a factor...
> 
> I tried it, and ACPI didn't help.  I compiled a new kernel, set my
> laptop to automatic mode and booted up at full speed.  I was able to
> run glxgears (which does a good job of maintaining 100% cpu usage) for
> about 5 minutes before the cpu overheated and shut down; it never went
> to low power mode (the fan was operating and blowing out some very
> warm air).  I rebooted with speedstep disabled, and ran glxgears for
> just under an hour, or until I was satisfied that I could run 100% cpu
> usage for an arbitrary (or at least a very long) period of time.

When I worked at Tuxtops, we had non-SpeedStep CPUs available, but the
SpeedStep ones were certainly more popular (at the time you couldn't get an
800 MHz CPU in a laptop without the darn stuff).  Our stress test for every
single box that headed out the door was to run CPUburn on it for a while and
get it nice and toasty.  K6s would always die within a few hours (but those
were the cheap seats; we were testing that the box would properly defend
itself) ... every other box could run overnight.  They got plenty warm,
anywhere from "I hope that thing's fan is running" to able to fry a hot dog,
but they didn't crash.  We didn't always have the leisure to put them through
overnights, but still, your system is being consistently wicked.

We didn't use ACPI at all (hey, this was almost a year ago... we could
experiment a little, but we couldn't tell customers that it would just work),
so the lack of ACPI is certainly not the whole story.

So I have to wonder if there isn't something else in the equation giving
your system fits.

On the general point, many device drivers depend on the BogoMIPS value,
which is calculated at boot time.  Most of the drivers that use it are
concerned with getting a minimum amount of wait time, in a unit that is
real enough at their tiny scale and that doesn't require a context switch
to ask about.  So the bigger number that you get from booting at high speed
will result in a longer delay when you are at a lower speed, but that is
at least usually -safe-.  This has resulted in the common wisdom being to
boot at full speed while connected to good quality power, etc., and live in
suspend after that.
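
To put rough numbers on that, here is a minimal sketch of the arithmetic.
It assumes the busy-wait loop rate scales linearly with clock speed, which
is a simplification.  The 800 MHz figure is the one mentioned above; the
500 MHz low speed and the function name are made up for illustration.

```python
def actual_delay_us(requested_us, calibrated_mhz, current_mhz):
    """Real time taken by a busy-wait delay that was calibrated at one
    clock speed but is executed at another.

    Simplifying assumption: loop iterations per second scale linearly
    with the clock speed.
    """
    # Iterations the driver spins for, based on the boot-time calibration.
    loops = requested_us * calibrated_mhz
    # Time those iterations take at the speed the CPU is running now.
    return loops / current_mhz


HIGH_MHZ = 800  # full speed, as mentioned in the post
LOW_MHZ = 500   # hypothetical low-power speed, for illustration only

# Booted fast, now running slow: the wait stretches (usually safe).
slow = actual_delay_us(10, HIGH_MHZ, LOW_MHZ)

# Booted slow, now running fast: the wait shrinks below the minimum asked for.
fast = actual_delay_us(10, LOW_MHZ, HIGH_MHZ)

print("asked for 10 us, calibrated fast, running slow:", slow, "us")
print("asked for 10 us, calibrated slow, running fast:", fast, "us")
```

The second case is the risky one for a driver that needs a guaranteed
minimum wait, which is why booting at the high speed is the safer default.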

However, anything which uses the BogoMIPS value in hope of calculating an
-exact- time, or a "no longer than" value, may not get what it hoped for.
If the speed difference is too great, it may be worse.  I read about, but
never needed and therefore haven't used, a utility to force a BogoMIPS
recalculation.  Sadly I dunno what it is called.  Perhaps that description
will be enough to search for it, and wiring up such recalcs into your apm
system (on battery and suspend/resume events) may help you stop crashing.
In your place, I would look at which device drivers I have loaded and see
if I can pinpoint one as being the fragile link in this chain.  If not,
well, I tried.
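
As a starting point for that driver hunt, a small sketch that reports the
boot-time BogoMIPS value and the names of the loaded modules.  It assumes a
Linux system exposing /proc/cpuinfo and /proc/modules; the helper names are
hypothetical, and it only lists candidates without telling you which driver
is the fragile one.

```python
import re


def parse_bogomips(cpuinfo_text):
    """Return every BogoMIPS value found in /proc/cpuinfo-style text."""
    # The label's capitalisation varies between architectures, so match
    # it case-insensitively at the start of a line.
    found = re.findall(r"(?im)^bogomips\s*:\s*([\d.]+)", cpuinfo_text)
    return [float(value) for value in found]


def parse_modules(modules_text):
    """Return module names from /proc/modules-style text.

    Only the first field of each line (the module name) is used.
    """
    return [line.split()[0] for line in modules_text.splitlines() if line.strip()]


def read_or_empty(path):
    """Read a file, returning an empty string if it cannot be opened."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


print("BogoMIPS at boot:", parse_bogomips(read_or_empty("/proc/cpuinfo")))
print("Loaded modules:  ", parse_modules(read_or_empty("/proc/modules")))
```

Running it once after booting at each speed shows how far apart the two
calibrations are, and the module list gives a set of drivers to unload one
at a time while testing.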

* Heather Stern * star@ many places...


