
Bug#700755: huge slab_unreclaimable in Xen domU



On Wed, Feb 20, 2013 at 10:27:02AM +0000, Ian Campbell wrote:
> On Sun, 2013-02-17 at 00:22 +0100, Josip Rodin wrote:
> > Package: linux-image-2.6.32-5-xen-amd64
> 
> This is in a guest, right? Is it possible to try the non-Xen amd64
> flavour? I forget the exact status in Squeeze but IIRC most of the domU
> functionality is present in the -amd64 flavour with the -xen-amd64
> flavour only being required for dom0 and some of the more advanced domU
> features.
> 
> The reason I ask this is that the non-xen flavour is closer to mainline
> and therefore should be easier to track down the issue with.
> 
> If you are also able separately to try this with the Wheezy kernel that
> would be very useful too.

OK, I can install both (it's got PV-GRUB). Which would you prefer I test
first? I'm asking because it'll likely take a few weeks for the bug to
appear, judging by what it did before.

> > The thing I noticed was the slab_unreclaimable explosion, by a factor
> > of 122. That... doesn't sound like something that should be happening.
> 
> Indeed. Is the system responsive enough to log in and
> examine /proc/slabinfo? There is probably one cache which has exploded
> in size; it may even be sufficient to observe this over time and see if
> one seems to be slowly creeping upwards towards $doom.
> 
> > I'm going to try to run slabtop the next time I catch it in this state,
> > in order to try to glean some more information.
> 
> That would be great.

I did post two consecutive slabtop results... I thought they had all the
relevant info from /proc/slabinfo.
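
For watching the caches over time, here is a minimal Python sketch that
compares two saved copies of /proc/slabinfo and lists the caches whose
object count grew. It assumes the "slabinfo - version: 2.1" column layout
(name, active_objs, num_objs, objsize, ...); the function names
parse_slabinfo and growth are made up for illustration, and the byte
figure counts object payload only, so it underestimates real slab usage.

```python
def parse_slabinfo(text):
    """Return {cache_name: (num_objs, objsize_bytes)} from /proc/slabinfo text.

    Assumes the version 2.1 layout:
    name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : ...
    """
    caches = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        caches[fields[0]] = (int(fields[2]), int(fields[3]))
    return caches


def growth(before, after):
    """Return [(name, extra_objs, extra_bytes)] for caches that grew,
    largest estimated byte growth first."""
    rows = []
    for name, (objs_after, objsize) in after.items():
        objs_before = before.get(name, (0, objsize))[0]
        delta = objs_after - objs_before
        if delta > 0:
            rows.append((name, delta, delta * objsize))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

With old_text and new_text holding the contents of /proc/slabinfo saved
some hours apart (reading the file usually needs root),
growth(parse_slabinfo(old_text), parse_slabinfo(new_text))[:5] would list
the five fastest-growing caches.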

The two large caches that grew, both in total and in active object count,
were (extracted from my previous message):

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
first readout:
 65419  65419 100%    4.00K  14179        8    453728K kmalloc-4096
 65390  65390 100%    2.06K  13338       15    426816K net_namespace
second readout:
 65428  65428 100%    4.00K  14181        8    453792K kmalloc-4096
 65391  65391 100%    2.06K  13339       15    426848K net_namespace

How do I trace which process is causing these allocations?

In comparison, now, under seemingly normal circumstances, slabtop looks like
this on that machine:

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
 56124  25272  45%    0.11K   1559       36      6236K buffer_head
 24843  12898  51%    0.19K   1183       21      4732K dentry
 23100  16107  69%    1.01K   1540       15     24640K nfs_inode_cache
 11456   6403  55%    0.06K    179       64       716K kmalloc-64
 10208   8864  86%    0.12K    319       32      1276K kmalloc-128
  7308   5275  72%    0.55K    522       14      4176K radix_tree_node
  4947   4940  99%    0.08K     97       51       388K sysfs_dir_cache
  3584   3573  99%    0.01K      7      512        28K kmalloc-8
  3200   2016  63%    0.79K    160       20      2560K ext3_inode_cache
  2068   1981  95%    0.18K     94       22       376K vm_area_struct
  1792   1790  99%    0.02K      7      256        28K kmalloc-16
  1692   1631  96%    0.63K    141       12      1128K proc_inode_cache
  1632   1588  97%    1.00K    102       16      1632K kmalloc-1024
  1472   1442  97%    0.25K     92       16       368K kmalloc-256
  1428   1129  79%    0.19K     68       21       272K kmalloc-192
  1296   1284  99%    4.00K    162        8      5184K kmalloc-4096
  1275   1270  99%    2.06K     85       15      2720K net_namespace
[...]
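
To put numbers on the difference between the two states, here is a rough
Python sketch that parses slabtop rows copied from this message and
computes, per cache, how many times larger the bloated readout is than the
normal one. The names parse_slabtop_row and growth_ratios are made up for
illustration, and the parser assumes the eight-column slabtop layout shown
above.

```python
# Rows copied from the first bloated readout and the normal readout above.
BLOATED_ROWS = [
    " 65419  65419 100%    4.00K  14179        8    453728K kmalloc-4096",
    " 65390  65390 100%    2.06K  13338       15    426816K net_namespace",
]
NORMAL_ROWS = [
    "  1296   1284  99%    4.00K    162        8      5184K kmalloc-4096",
    "  1275   1270  99%    2.06K     85       15      2720K net_namespace",
]


def parse_slabtop_row(line):
    """Parse one slabtop data row; field order follows the slabtop header."""
    objs, active, _use, _obj_size, _slabs, _per_slab, cache_size, name = line.split()
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "cache_kib": int(cache_size.rstrip("K")),  # "453728K" -> 453728
    }


def growth_ratios(bloated_rows, normal_rows):
    """Return {name: (object_ratio, cache_size_ratio)} for caches in both lists."""
    normal = {row["name"]: row for row in map(parse_slabtop_row, normal_rows)}
    ratios = {}
    for row in map(parse_slabtop_row, bloated_rows):
        base = normal.get(row["name"])
        if base:
            ratios[row["name"]] = (
                row["objs"] / base["objs"],
                row["cache_kib"] / base["cache_kib"],
            )
    return ratios


for name, (obj_ratio, size_ratio) in growth_ratios(BLOATED_ROWS, NORMAL_ROWS).items():
    print(f"{name}: {obj_ratio:.1f}x objects, {size_ratio:.1f}x cache size")
```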

-- 
     2. That which causes joy or happiness.

