
Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled



On Tue, 2016-01-26 at 19:46 +0200, KSB wrote:
> > This is actually useful, because it shows that the issue occurs even
> > with Xen 4.6, which I think rules out a Xen side issue (otherwise we'd
> > have had lots more reports from 4.4 through to 4.6) and points to a
> > kernel side issue somewhere.
> > 
> > > But I checked logs more thoroughly and found it even on more recent
> > > kernels:
> > > 1) Lot of messages on 3.14-2-amd64 with xen-4.6, 13 domU's.
> > 
> > Just to be clear, "Lots" here means "hundreds or thousands"? I think it
> > is expected to see one or two around the time a VM is started or
> > stopped, so with 13 domUs a couple of dozen messages wouldn't seem out
> > of line to me.
> > 
> pkg 3.14.15-2
> ~1600 from last dmesg cleanup which was 23h ago, but all of them 
> distributed in last 15h
> 
> 
> > > 2) 4.3.0-1-amd64 xen-4.6, only two messages shortly after boot, only
> > > 1 domU running:
> > > [   12.473778] xen:balloon: Cannot add additional memory (-17)
> > > [   21.673298] xen:balloon: Cannot add additional memory (-17)
> > > uptime 17 days.
> > > 
> > > Previous on same machine was 4.2.0-1-amd64 with more (-17)'s
> > 
> > Was it running xen-4.6 when it was running 4.2.0 or was that also
> > older?
> 
> 4.3.3-5 xen-4.6.0 and previous 4.2.6-1 xen-4.4.1

Thanks. And just to clarify: with Linux 4.2.6-1 on xen-4.4.1, were you
seeing this issue or not?
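
To make "a couple of dozen" versus "thousands" easy to compare between
machines, something along these lines could count the messages in saved
dmesg output. This is only a minimal sketch: the function name is
hypothetical, and it assumes the default dmesg timestamp format shown in
the quoted log lines above (seconds since boot in square brackets).

```python
import re

# Matches dmesg lines of the form quoted in this thread, e.g.
#   [   12.473778] xen:balloon: Cannot add additional memory (-17)
# Assumption: timestamps are seconds since boot, as in default dmesg output.
BALLOON_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+xen:balloon: Cannot add additional memory \((-?\d+)\)"
)


def summarise_balloon_errors(lines):
    """Return (count, first_timestamp, last_timestamp) for balloon errors.

    `lines` is any iterable of dmesg-style strings (a list or an open file).
    The timestamps are None when no matching line is found.
    """
    stamps = [
        float(match.group(1))
        for match in map(BALLOON_RE.match, lines)
        if match is not None
    ]
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)
```

Calling summarise_balloon_errors(open("dmesg.txt")) on a saved log (the
file name is just an example) gives the total and the time span the
messages are spread over.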

To summarise what I can tell from this bug log, the following combinations
are (Y) or are not (N) prone to this issue:

                        xen-??? xen-4.1 xen-4.4.1 4.4.1-9+deb8u3 xen-4.6.0
3.14.15-2                                                        Y[1]

3.16.7-ckt7-1                   N[1]
3.16.7-ckt9-3~deb8u1    Y[2]
3.16.7-ckt20-1+deb8u2                             Y[3]

4.2.6-1                                 ?[1]
4.3.3-5                                                          NN[1]
4.3.3-7                                                          N[1]

[1] KSB
[2] ML (original report, Xen version unknown)
[3] AS (with dom0_mem=1024M,max:1024M, but not dom0_mem=1024M)

The N for xen-4.1 + linux-3.16.7-ckt7-1 (KSB's #4) seems anomalous. Perhaps that version is susceptible but simply did not exhibit the issue during the span of the logs.

The ? for xen-4.4.1 + linux-4.2.6-1 is the "just to clarify" above.

In any case it does appear to correlate with the Linux version and not the Xen version, and it does appear to be fixed in 4.3.3-5, or possibly even 4.2.6-1.
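
As a rough cross-check of that reading, the table can be written down as
data and queried for Xen versions that show both outcomes. This is only
an illustrative sketch: the rows are transcribed by hand from the table
above, the unconfirmed "?" row is left out, and the function name is
hypothetical.

```python
from collections import defaultdict

# (linux version, xen version, affected?) transcribed from the table above.
# The "?" entry (4.2.6-1 on xen-4.4.1) is omitted because it is unconfirmed.
OBSERVATIONS = [
    ("3.14.15-2", "xen-4.6.0", True),
    ("3.16.7-ckt7-1", "xen-4.1", False),
    ("3.16.7-ckt9-3~deb8u1", "xen-???", True),
    ("3.16.7-ckt20-1+deb8u2", "4.4.1-9+deb8u3", True),
    ("4.3.3-5", "xen-4.6.0", False),
    ("4.3.3-7", "xen-4.6.0", False),
]


def xen_versions_with_both_outcomes(observations=OBSERVATIONS):
    """Return Xen versions reported both with and without the issue."""
    outcomes = defaultdict(set)
    for _linux, xen, affected in observations:
        outcomes[xen].add(affected)
    return sorted(xen for xen, seen in outcomes.items() if len(seen) == 2)
```

With the rows above this returns only xen-4.6.0, i.e. the same Xen
version is affected under 3.14.15-2 but not under 4.3.3-5 or 4.3.3-7,
which is what points at the kernel. With so few data points it is an
indication rather than proof.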

I'm still unable to spot what might have changed between 3.16.7-ckt20-1+deb8u2 and 4.3.3-5 to explain it going away, though, which I'd still quite like to get to the bottom of in order to fix it in Jessie.

Thanks,


Ian.

