
Re: Problems with PAE kernel sort of solved.



Frank McCormick wrote:
> Camaleón wrote:
> > Frank McCormick wrote:
> >
> >>I have been having problems with the new series of PAE kernels. I could
> >>never get them to boot on my machine (see bug #632734) .

You early adopter you.  I haven't rebooted my machine into the new
kernel yet today.  :-)

Personally I am using the 64-bit amd64 kernel and so I won't be able
to verify whether PAE is a problem with the new kernel or not since
PAE is only relevant to 32-bit kernels.
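
For anyone who wants to check what their own cpu advertises, here is a
minimal Python sketch.  It assumes an x86 Linux system where
/proc/cpuinfo has a "flags" line; the function names are my own and
only for illustration.  The "pae" flag means the cpu supports PAE and
the "lm" flag (long mode) means it can run a 64-bit kernel.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report_flags(path="/proc/cpuinfo"):
    """Print whether the cpu advertises PAE and 64-bit long mode."""
    with open(path) as f:
        flags = cpu_flags(f.read())
    print("pae:", "pae" in flags)
    print("lm (64-bit capable):", "lm" in flags)
```

Calling report_flags() on the machine in question prints the two
answers.  Don't read too much into the "ht" flag in the same list; it
is not a reliable indication that hyperthreading is actually in use.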

> >>This morning I installed the "new" 3.0.0 kernel, and spent a half hour
> >>changing BIOS settings in an attempt to get it to boot.

I assume that you could select the previous kernel and boot okay?  It
would be a good thing to verify as an A-B comparison test, because it
is possible that the old kernel now fails to boot too.

> >>Nothing worked until I turned OFF the hyper-threading option. The
> >>kernel now boots fine, but the system does not see it as a
> >>dual-core but as one CPU.

That is odd since hyperthreading really has little to do with
multi-core cpus.  It is orthogonal to it.  In many ways HT is an Intel
marketing breakthrough since it is not a feature of AMD processors so
if consumers ask for hyperthreading the sales force will direct you to
an Intel cpu.  Otherwise it has little advantage.

> >>The problem is the system **seems** significantly slower than it was, a
> >>costly trade-off to run the new kernel. I don't see the connection
> >>between the PAE option the kernel now uses (and which my dual core CPU
> >>supports) and hyper-threading. Can anyone enlighten me?

I have used the PAE kernel on other systems with 32-bit kernels and 6G
of ram for many years.  While it is a very small amount slower than a
non-PAE kernel it isn't noticeably slower, and the extra ram in the
system was useful.  I had to run benchmarks to quantify the difference
at a few percent.  I forget the exact numbers off the top of my head
and don't feel like digging out the now quite old data.

> >Mmm... not sure if this will answer your question but as I understand it, HT
> >is the hardware part while SMP is the logical/software part you need to
> >"double" your microprocessor. You need both to get the job done, so
> >when you disable HT in the BIOS, it is the same as if you had installed a
> >non-SMP kernel.
> >
> >Well, sort of :-)

Not quite.  There are good descriptions of HT on the web and I defer
exactness and details to those but basically with HT the cpu registers
are duplicated but not the execution unit.  The execution unit is the
heart of the cpu and you still only have one of those.  HT creates a
"fake" dual-core by switching between two banks of registers.  It
isn't a real dual-core.  You will not double your performance.

HT keeps the state of two threads in two banks of processor
registers.  This makes switching between the two threads very quick
because no context needs to be saved or restored, but the two threads
compete for the same execution resources, so they do not both run at
full speed.  The effect is a single cpu with a small benefit from
having a larger number of registers in use.

A single-core cpu with HT enabled will be seen as two cpus.  A
dual-core cpu with HT enabled will be seen as four cpus, because each
of the two cores will show up as two HT cpus.  Enabling or disabling
HT in the BIOS should be completely independent of whether the machine
is SMP or not.
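
To tell visible cpus apart from real cores, a minimal Python sketch
along these lines may help.  It assumes an x86 Linux /proc/cpuinfo
with "processor", "physical id" and "core id" fields, which older
kernels and some cpus do not provide.  The function name is
hypothetical.

```python
def summarize_cpuinfo(text):
    """Count logical cpus, packages and physical cores in cpuinfo text."""
    logical = 0
    packages = set()
    cores = set()
    # /proc/cpuinfo has one block per logical cpu, separated by blank lines.
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Assumption: a missing "physical id" means a single package.
        pkg = fields.get("physical id", "0")
        packages.add(pkg)
        # Assumption: a missing "core id" means each cpu is its own core.
        cores.add((pkg, fields.get("core id", fields["processor"])))
    return {
        "logical": logical,
        "packages": len(packages),
        "cores": len(cores),
        # More logical cpus than distinct cores means HT siblings are visible.
        "ht_active": logical > len(cores),
    }
```

Feed it the contents of /proc/cpuinfo, for example
summarize_cpuinfo(open("/proc/cpuinfo").read()).  A single-core cpu
with HT enabled should come back with two logical cpus but only one
core.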

My benchmarks show that a single core machine gains very little
benefit from having HT enabled.  For me it was 2%-3%.  Nothing to get
too excited about.  But a dual core machine with HT enabled was about
5% slower than with it disabled.  It seems that the overhead in the
kernel of tracking threads on fake processors was larger than the
advantage gained by having a larger register set.  (You don't want to
put two processes on the first cpu and none on the second because then
the tasks will actually run at half speed.  Linux 2.4 didn't track
this but Linux 2.6 does to avoid the problem.  It creates accounting
overhead in the kernel.)

In summary for me I always disable HT on multi-core machines.  It
gives me a small performance increase.  For single core machines it
doesn't really matter and so I tend to leave it enabled.

Note that the old Linux 2.4 kernel did not understand HT and would
timeslice across all visible cpus.  For a single-core with HT making
two cpus visible it really wasn't a problem.  But this created
terrible problems on multi-core machines and HT always needed to be
disabled on those for the old 2.4 series kernels.  Otherwise one cpu
would have too much work and the other not enough.  Since 2.4 really
isn't too relevant today I won't say more about it now.

>   Yes, that I guess is why the system now "sees" only one CPU when I
> have a dual core.

Did you *really* have a dual-core?  Or did you have a single-core with
HT enabled?  I think probably the latter based upon your description
so far.  In which case having HT off shouldn't be a big deal.  I would
need objective benchmarking to convince me otherwise.

> But I still fail to understand why turning off hyper-threading
> allows the kernel (whose only major change is supposedly the use of
> the PAE extension) to boot when it wouldn't before.  Anyway I guess I am
> barking up the wrong tree - the kernel developers seem comfortable
> with their assumption that it's a hardware fault on my
> machine. Could be, or maybe not.

The problem is that a hardware fault is plausible.

Here is what I would do to get some more information.

* Benchmark on the new kernel to obtain some objective data on the
  performance.  (Something simple should be sufficient to start.  I
  would do something real but objective.  I would grab a package out
  of Debian and compile and build it.)

* Verify that the cpu really is a multi-core cpu and not just a single
  core with hyperthreading.

* Boot back to the previous kernel as a test to verify that the
  previous kernel still boots.

* Benchmark on the old kernel to obtain some objective data on the
  performance.
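
For the two benchmark steps, a minimal Python sketch of the timing
could look like this.  The helper names are made up for illustration
and the command to time is whatever build you pick.  Run the same
command under each kernel and compare the medians.

```python
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv several times and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken build
        # does not get counted as a fast one.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def percent_change(old_seconds, new_seconds):
    """Positive means the new kernel took longer than the old one."""
    return (new_seconds - old_seconds) * 100.0 / old_seconds
```

For example, take statistics.median(time_command(your_build_command))
on each kernel and pass the two medians to percent_change().  Make
sure each run starts from the same state (a clean source tree),
otherwise later runs will look faster than they are.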

Bob
