
RE: Hyperthreading problem with IRQ handling and scheduling



Hi Stan,

Thanks for your reply. I will bring this to the attention of the system
administrator (I have root access but I don't think they'll appreciate me
installing a new kernel on my own).

I've just discovered that a similar issue can also occur with pure compute
tasks (no I/O at all). If I run 8 of those in parallel, some of them will
run on logical CPUs of the same physical CPU, and those are slower than the
ones that get a physical CPU to themselves. Since there are enough physical
CPUs available, I don't believe the scheduler should do this. Hopefully
upgrading the kernel will resolve this as well.
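
To confirm which logical CPUs share a physical core, something like the following may help. It is only a sketch: the function name cpu_topology is made up for illustration, and it assumes /proc/cpuinfo lists "physical id" before "core id" for each processor entry, as it does on typical x86 kernels.

```shell
# Print "cpuN packageP coreC" for each processor entry in a cpuinfo file.
# Usage: cpu_topology [cpuinfo-file]   (defaults to /proc/cpuinfo)
cpu_topology() {
    awk -F': *' '
        /^processor/   { cpu = $2 }   # logical CPU number
        /^physical id/ { pkg = $2 }   # socket the logical CPU belongs to
        /^core id/     { print "cpu" cpu " package" pkg " core" $2 }
    ' "${1:-/proc/cpuinfo}"
}
```

Running `cpu_topology | sort -k2,3` groups the entries by package and core; two logical CPUs that report the same package and core are hyperthread siblings. As a stopgap until the kernel is upgraded, a task can be pinned to a chosen logical CPU with `taskset -c <cpu> <command>`.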

Thanks,
Sven

-----Original Message-----
From: Stan Hoeppner [mailto:stan@hardwarefreak.com] 
Sent: vrijdag 4 maart 2011 16:10
To: debian-user@lists.debian.org
Subject: Re: Hyperthreading problem with IRQ handling and scheduling

Sven Groot put forth on 3/3/2011 11:28 PM:

Hello Sven,

> I am using a cluster of machines running Debian 5.0.4, kernel 
> 2.6.26-2-amd64. These machines have dual Intel Xeon E5530 2.4GHz CPUs, 
> which are quad-core CPUs with hyperthreading. So that means each 
> machine has 8 physical CPUs and a total of 16 logical CPUs.

> I have run into an apparent issue with the kernel scheduler. Under the 
> circumstances described below, the scheduler will run two tasks on two 
> logical CPUs of the same physical CPU, even if all the remaining 
> physical CPUs are idle. This obviously causes a large slowdown for
> these tasks.

<snip>

Two things.

First, you're running Debian kernel 2.6.26 which, IIRC, either doesn't have
all the scheduler patches required for both multi-core and HT support or
doesn't have them all enabled, which is the cause of your problem.  You need
a new kernel (two options are listed below) with both of the following set:

CONFIG_SCHED_SMT
CONFIG_SCHED_MC
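
A quick way to see whether a given kernel has these options set is to grep its config file. This is a minimal sketch: the function name check_sched_opts is made up for illustration, and it assumes the config is installed as /boot/config-<version>, which is where Debian kernel packages normally put it.

```shell
# Report whether each scheduler option is set to "y" in a kernel config file.
# Usage: check_sched_opts [config-file]
# Defaults to the config of the running kernel under /boot.
check_sched_opts() {
    config="${1:-/boot/config-$(uname -r)}"
    for opt in CONFIG_SCHED_SMT CONFIG_SCHED_MC; do
        if grep -q "^${opt}=y" "$config" 2>/dev/null; then
            echo "$opt: set"
        else
            echo "$opt: NOT set"
        fi
    done
}
```

Calling `check_sched_opts` with no argument checks the running kernel; passing a path checks a config file for a kernel that is installed but not yet booted. A missing or unreadable file is reported the same way as an unset option.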

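1.  Install the latest Debian prepackaged lenny-backports kernel on each
cluster node:  linux-image-2.6.32-bpo.5-amd64_2.6.32-30~bpo50+1_i386.deb
(That file name is the 64-bit kernel packaged for a 32-bit userland; on a
64-bit install the package file ends in _amd64.deb instead.)
http://backports.debian.org/Instructions/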

If the nodes don't have direct internet access, preventing installation via
apt-get or aptitude, then download the .deb package, copy it to each machine
via scp/ftp/nfs/etc, and install it via dpkg:

dpkg -i /full/path/to/linux-image-2.6.32-bpo.5-amd64_2.6.32-30~bpo50+1_i386.deb

I've never installed a backport package directly via dpkg.  You may need an
additional switch or two.  Others here can answer this.


2.  Download the 2.6.37.2 vanilla source from:
http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.37.2.tar.bz2
Follow the build instructions here:
http://kernel-handbook.alioth.debian.org/ch-common-tasks.html

to create a kernel image with the options and modules you need, and none you
don't, and to create a kernel deb package.  Copy the .deb to each cluster
node and perform:

dpkg -i /full/path/to/linux-image-2.6.37.2-custom.1.0_amd64.deb


Second, you may want to ask about this on lkml as well, as far more
expertise in this area of the kernel resides there.  Installing a new kernel
will solve the bulk of your problem.  To fine tune the performance per
core/thread afterward you'll need assistance from kernel devs on lkml.

Hope this points you in the right direction.

--
Stan

