
Re: OT? hyperthread on or off for SMP kernel



On Thu, 2004-12-09 at 13:59, Paolo Alexis Falcone wrote:
> On Thu, 09 Dec 2004 13:15:00 +0000, michael
> <linux@networkingnewsletter.org.uk> wrote:
> > On Thu, 2004-12-09 at 12:53, Ron Johnson wrote:
> > 
> > 
> > > On Thu, 2004-12-09 at 11:48 +0000, michael wrote:
> > > > Hi - I've had a quick look on the WWW but can't find much of help. I've
> > > > a dual Xeon box running testing and kernel from backports:
> > > >     m@r:~$ uname -a
> > > >     Linux r 2.4.27-1-686-smp #1 SMP Fri Sep 3 06:34:36 UTC 2004 i686
> > > >     GNU/Linux
> > > >
> > > > The SMP kernel does indeed make use of the processors available: 2 if
> > > > hyperthreading (HT) is off and 4 if it is on.
> > > >
> > > > My questions are:
> > > > a) for MPI codes (mpich v1.2.6 compiled with Intel 8.1 compilers), under
> > > > which circumstances is better performance achieved with HT on or off?
> > > > I'm looking for quite a detailed analysis which, if it hasn't already been
> > > > done, I can kick off if anybody's interested
> > > >
> > > > b) for general day to day running do people leave HT on or turn it off?
> > >
> > > How threaded are your apps?  Or, maybe, how many apps do you run
> > > at once?
> > >
> > > On Windows benchmarks, at least, performance of non-threaded apps
> > > drops when HT is turned on.
> > 
> > The apps in question are MPI Fortran/C codes and I can say at run time
> > how many processors (threads) I require. Thus for my dual Xeon I could
> > have HT off and go up to 2 processors. But is it worth switching HT back
> > on and going to 4 (logical) processors, that is my main question. I
> > believe the answer is "depends on the nature of your codes" and I'm
> > interested in hearing if anybody has studied this (eg with known memory
> > intensive, compute intensive, comms intensive code kernels (not linux
> > kernels!)). If not, I'm happy to kick this off as a low priority
> > project. I've also posed this question to comp.parallel.mpi so if people
> > are interested I can summarize any results/answers here.
> 
> Traditional SMP architecture features processors each with its own set
> of caches, execution resources, and buses. You won't have many
> problems if you don't saturate the bus with data transfers, don't have
> contention over the bus, and keep the CPU cores busy. This is
> achievable if you employ your machine as a computing workhorse and not
> as a memory-chewing machine.
> 
> On-chip simultaneous multithreading (aka hyperthreading as marketed by
> Intel), however, is constrained by having only one set of caches, one
> set of execution resources, and one bus, but you have two logical CPUs
> contending for those, each running its own different thread. A thread
> can hog most of the resources of your machine should the other thread
> choose not to use them.  This is why SMT generally screws up scheduling
> algorithms that work normally in traditional SMP systems.
> 
> In your case, however, it would seem that most of your processes are
> CPU-bound - in which case enabling SMT/HyperThreading would be very
> beneficial. IBM developerWorks has this dated study, which might be
> useful nonetheless:
> http://www-128.ibm.com/developerworks/linux/library/l-htl
> -- 
> Paolo Alexis Falcone
> pfalcone@gmail.com
> 

Paolo - thanks for that reference, just what I was hoping to find (why I
was looking on Intel's site I dunno!). I understand & agree with the
points you raise. I'll plough through the ref and see what I can glean.

However, for one simple test case (v. low memory, v. low comms, high
compute), which is Fortran90 compiled using mpich, I actually found
turning HT off helped -- with HT on, -np 2 gave no speed-up (top implied
both processes ran on the same physical processor). I presume this was
because the CPU was fully busy anyhow so the 2nd thread had to wait (you
seem to imply the opposite but I don't follow why!).
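
As a rough way to check which logical CPUs share a physical processor
(rather than inferring it from top), here is a minimal Python sketch
that groups the "processor" entries in /proc/cpuinfo by their "physical
id" field. It assumes the running kernel actually exposes "physical id",
which I have not verified for 2.4.27; the SAMPLE text and the function
name group_by_package are made up for illustration and are not output
from my box.

```python
from collections import defaultdict


def group_by_package(cpuinfo_text):
    """Map each 'physical id' to the sorted logical 'processor' numbers on it."""
    packages = defaultdict(list)
    logical = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical = int(value)
        elif key == "physical id" and logical is not None:
            packages[int(value)].append(logical)
    return {pkg: sorted(cpus) for pkg, cpus in packages.items()}


# Hypothetical excerpt for a dual-package box with HT on (illustration only).
SAMPLE = """\
processor : 0
physical id : 0

processor : 1
physical id : 0

processor : 2
physical id : 3

processor : 3
physical id : 3
"""

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample off Linux
    for pkg, cpus in sorted(group_by_package(text).items()):
        print(f"physical package {pkg}: logical CPUs {cpus}")
```

If two busy processes are both shown on logical CPUs that this lists
under the same package, they are sharing one set of execution units.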

I can see more testing of 'mpich' on HT is probably worth doing,
particularly for "real life" codes, but it also strikes me that sometimes
it is beneficial to put the processes on different physical processors
(thus allowing HT for system and max return for MPI codes).
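
To make the "different processors" idea concrete, here is a small Python
sketch, not anything mpich itself does: given a map from physical
package to its logical CPUs, pick one logical CPU per package and pin a
process to it. The map layout and the helper names (one_cpu_per_package,
cpu_for_rank, pin_current_process) are my own assumptions.
os.sched_setaffinity is a real Python call on Linux, but whether a given
kernel honours it (a 2.4 kernel may well not) is something to check.

```python
import os


def one_cpu_per_package(packages):
    """Return the lowest-numbered logical CPU from each physical package."""
    return sorted(min(cpus) for cpus in packages.values() if cpus)


def cpu_for_rank(rank, packages):
    """Spread ranks round-robin over one logical CPU per physical package."""
    targets = one_cpu_per_package(packages)
    return targets[rank % len(targets)]


def pin_current_process(cpu):
    """Restrict the calling process to one logical CPU.

    Returns False where the platform offers no affinity call.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {cpu})  # pid 0 means "this process"
    return True


# Hypothetical layout: two packages, two HT logical CPUs each.
EXAMPLE_PACKAGES = {0: [0, 1], 3: [2, 3]}
```

With the example layout, ranks 0 and 1 would go to logical CPUs 0 and 2,
i.e. one per physical processor, leaving the sibling logical CPUs free
for everything else.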

Cheers, Michael


