
Bug#576838: KVM: networking stack tanks after page allocation failure



On Thu, 2010-04-08 at 12:41 -0400, micah anderson wrote:
> On 2010-04-08, micah anderson wrote:
> > On Wed, 2010-04-07 at 11:52 -0400, Micah Anderson wrote:
> > > Package: linux-image-2.6.32-2-amd64
> > > Version: 2.6.32-8~bpo50+1
> > > Severity: important
> > > 
> > > I'm running a Tor exit node on a KVM instance. It runs for a little
> > > while (between an hour and 3 days), doing 30-40 Mbit/s, and then
> > > suddenly 'swapper: page allocation failure' happens and the entire
> > > networking stack of the KVM instance is dead. It stops responding on
> > > the net completely: no ping in or out, no traffic can be observed
> > > using tcpdump, and the counters on the interface no longer change
> > > (although the interface stays up).
> > [...]
> > 
> > It sounds like there might be a memory leak.  Please send the contents
> > of /proc/meminfo and /proc/slabinfo from a 'normal' state and the broken
> > state.
> 
> This time when it crashed I noticed something different, which I had
> not seen in previous 2.6.30/2.6.26 kernels:
> 
> [ 7962.841287] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> [ 7962.841287]   cache: kmalloc-1024, object size: 1024, buffer size: 1024, default order: 1, min order: 0
> [ 7962.841287]   node 0: slabs: 606, objs: 4544, free: 0
> 
> and then the normal:
> [ 7963.102476] swapper: page allocation failure. order:0, mode:0x4020
> [ 7963.105743] Pid: 0, comm: swapper Not tainted 2.6.32-bpo.2-amd64 #1
> [ 7963.106418] Call Trace:
> [ 7963.106418]  <IRQ>  [<ffffffff810b947d>] ? __alloc_pages_nodemask+0x55b/0x5ce
> etc. 
> 
> As requested, here are /proc/meminfo and /proc/slabinfo from a normal
> state. See below for the broken state.
[...]

There's no sign of a memory leak and there's actually much more free
memory in the broken state, perhaps because any network servers have
lost all their clients and freed session state.  My guess is that the
driver just doesn't handle allocation failure gracefully.  Which network
driver are you using in the guest?
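
For anyone following along, here is a minimal sketch of how the two pieces of information asked for in this thread could be gathered inside the guest: a field-by-field comparison of two saved /proc/meminfo snapshots, and the name of the kernel driver bound to a network interface. The function names, the snapshot file names and the interface name "eth0" are illustrative assumptions, not anything taken from the report. The driver lookup relies on the sysfs symlink /sys/class/net/<iface>/device/driver, which may be absent for purely virtual interfaces.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Values are taken as printed (kB for most fields); the unit is dropped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def diff_meminfo(normal, broken):
    """Return {field: broken - normal} for fields present in both and changed."""
    common = set(normal) & set(broken)
    return {k: broken[k] - normal[k] for k in sorted(common)
            if broken[k] != normal[k]}


def guest_nic_driver(iface, sysfs_root="/sys/class/net"):
    """Return the driver name bound to iface, or None if it cannot be read."""
    link = os.path.join(sysfs_root, iface, "device", "driver")
    try:
        return os.path.basename(os.readlink(link))
    except OSError:
        return None


if __name__ == "__main__":
    # Hypothetical snapshot files, saved beforehand with e.g.
    #   cat /proc/meminfo > meminfo.normal
    for path in ("meminfo.normal", "meminfo.broken"):
        if not os.path.exists(path):
            print("missing snapshot:", path)
            break
    else:
        with open("meminfo.normal") as f:
            normal = parse_meminfo(f.read())
        with open("meminfo.broken") as f:
            broken = parse_meminfo(f.read())
        for field, delta in diff_meminfo(normal, broken).items():
            print("%-16s %+d" % (field, delta))
    print("eth0 driver:", guest_nic_driver("eth0"))
```

A positive delta for MemFree in the broken state would match the observation above that more memory is free after the failure. The same parsing approach does not apply directly to /proc/slabinfo, which has a different column layout.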

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.


