
Bug#689861: Issues with Xen when all CPUs are available to dom0



On Sun, 2012-10-07 at 12:52 +0200, Peter Viskup wrote:
> Package: linux-image-2.6.32-5-xen-amd64
> Version: 2.6.32-45
> 
> I am experiencing issues with Xen once all CPUs are available to dom0. 
> There is high "steal time" shown once I do not set one CPU available for 
> dom0 (doesn't matter what way is used - xend-config, Linux or Xen 
> hypervisor boot argument).
> If all CPUs are available to dom0 all tries to start domU fail with 
> timeouts. More detailed description is in xen-utils bugreport opened by 
> me in July 2012 [1] with no response till today. It is reproducible on 
> two different servers running Xen (one Intel Xeon and second AMD Opteron).
> Please consider if it is related to kernel or not.
> Anyway - we just jump into situation where no dynamic domU's 
> configuration change is possible and this is causing us serious 
> manageability and serviceability issues.

I'm afraid I don't have any particularly dazzling insights here. One
thing you could try is asking on the upstream xen-users@ list in case
someone else has seen this, although it doesn't ring any bells for me.

Another experiment might be to try the wheezy hypervisor and/or kernel
packages.

The stolen time thing is weird, since that is time spent where the VCPU
could run but is not because another VCPU is scheduled -- but if you
can't start any guests then there is nothing to compete against. It
might be interesting to investigate a little where all the CPU time is
going, firstly using top to check for rogue processes in dom0 and then
xentop to look for rogue VCPUs. Pressing 'd' on the Xen debug console a
few times ("statistical sampling") might also give a clue where the
physical CPUs are spending all of their time.
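
If it helps to put a number on the stolen time rather than eyeballing
top, something along these lines could work. It is only a sketch: it
assumes the aggregate "cpu" line of /proc/stat in dom0 carries the usual
field order (user, nice, system, idle, iowait, irq, softirq, steal, ...),
and the function name and sample values are made up for illustration.

```python
def steal_fraction(before, after):
    """Fraction of CPU time reported as steal between two samples.

    `before` and `after` are the aggregate "cpu ..." lines read from
    /proc/stat at two points in time (assumed field order: user, nice,
    system, idle, iowait, irq, softirq, steal, ...).
    """
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    if len(a) < 8 or len(b) < 8:
        raise ValueError("no steal field in this /proc/stat line")
    deltas = [y - x for x, y in zip(b, a)]
    # Sum user..steal only; guest time is already counted inside user.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0


# Hypothetical sample lines, not taken from the reporter's machine.
BEFORE = "cpu 100 0 50 800 10 0 0 40 0 0"
AFTER = "cpu 200 0 100 1000 20 0 0 680 0 0"
print(steal_fraction(BEFORE, AFTER))
```

On a live system the two lines would be read from /proc/stat a few
seconds apart.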

How many physical CPUs do you have?

> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=683170

Hang on, this shows:
        server1:~# xm vcpu-list 0
        Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
        Domain-0                             0     0     0   r--    1568.2 0
        Domain-0                             0     1     -   --p     129.3 0
        Domain-0                             0     2     -   --p     132.1 0
        Domain-0                             0     3     -   --p     134.8 0
        
IOW you have 4 dom0 VCPUs but they are all constrained to run on
physical CPU0, which would lead precisely to loads of stolen time!

What pinning options are you using to achieve this? It might be useful
to provide your full command lines (both hypervisor and kernel) and
config files etc. A boot log wouldn't go amiss either.

Contrast with my system here:
        root@calder:~# xm vcpu-list
        Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
        Domain-0                             0     0     0   -b-    1628.5 any cpu
        Domain-0                             0     1     1   r--    1539.1 any cpu

Here you see that my 2 dom0 VCPUs are free to run on any PCPU. Even
with pinning I would expect VCPU0->PCPU0 and VCPU1->PCPU1.
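
To spot this kind of pinning without reading the table by eye, a rough
check over the `xm vcpu-list` output might look like the following. This
is a sketch under assumptions: the column layout is taken from the two
listings above (affinity is everything after the Time(s) column), and
the function names are hypothetical.

```python
def parse_vcpu_list(text):
    """Return (domain, vcpu, affinity) tuples from `xm vcpu-list` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: Name ID VCPU CPU State Time(s) Affinity...
        # Header lines are skipped because their ID column is not numeric.
        if len(fields) < 7 or not fields[1].isdigit():
            continue
        rows.append((fields[0], int(fields[2]), " ".join(fields[6:])))
    return rows


def crowded_domains(rows):
    """Map each domain whose VCPUs all share one physical CPU to that CPU."""
    by_domain = {}
    for name, _vcpu, affinity in rows:
        by_domain.setdefault(name, []).append(affinity)
    flagged = {}
    for name, affs in by_domain.items():
        # More than one VCPU, all with the same single-number affinity.
        if len(affs) > 1 and len(set(affs)) == 1 and affs[0].isdigit():
            flagged[name] = int(affs[0])
    return flagged


PINNED = """\
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                             0     0     0   r--    1568.2 0
Domain-0                             0     1     -   --p     129.3 0
Domain-0                             0     2     -   --p     132.1 0
Domain-0                             0     3     -   --p     134.8 0
"""

FREE = """\
Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
Domain-0                             0     0     0   -b-    1628.5 any cpu
Domain-0                             0     1     1   r--    1539.1 any cpu
"""

print(crowded_domains(parse_vcpu_list(PINNED)))
print(crowded_domains(parse_vcpu_list(FREE)))
```

The first listing is flagged because all four dom0 VCPUs have affinity
0; the second is not, since its affinity is "any cpu".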

Ian.
-- 
Ian Campbell

Have a place for everything and keep the thing somewhere else; this is not
advice, it is merely custom.
		-- Mark Twain

