Re: Slab Unreclaimable is continually growing
On Fri, Jul 26, 2019 at 08:55:19PM +0200, Matthias Böttcher wrote:
> Reco <recoverym4n@enotuniq.net>:
> >
> > Hi.
> >
> > On Wed, Jul 24, 2019 at 06:54:42PM +0200, Matthias Böttcher wrote:
> > > OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> > > 307534 304741 99% 0,20K 16186 19 64744K vm_area_struct
> > > 14280 14274 99% 3,69K 1785 8 57120K task_struct
> > > 178048 152224 85% 0,25K 11128 16 44512K filp
> > > 8536 8536 100% 4,00K 1067 8 34144K kmalloc-4096
> > > 14640 14640 100% 2,06K 976 15 31232K sighand_cache
> > >
> > > How can I detect what is eating up my memory in SUnreclaim (slab unreclaimable)?
> >
> > You did it already.
> > "vm_area_struct" is a kernel structure for anonymous memory allocations.
> > "task_struct" is a kernel structure for maintaining process execution.
> > "filp" is a kernel structure for virtual memory.
> >
> > My guess is - a small number of processes that constantly allocate
> > memory in small numbers by executing brk(2) or its modern equivalents.
> >
> > Or a relatively large number of short-lived processes.
> >
> >
> > I'd start with "pidstat -rl 1 10".
>
> Now with an uptime of 2 days, 9 hours all counters of slabtop are
> growing no more.
>
> $ slabtop --sort c --once | head -n12
> Active / Total Objects (% used) : 10336938 / 10484768 (98,6%)
> Active / Total Slabs (% used) : 328327 / 328327 (100,0%)
> Active / Total Caches (% used) : 98 / 124 (79,0%)
> Active / Total Size (% used) : 2443615,58K / 2479644,41K (98,5%)
> Minimum / Average / Maximum Object : 0,01K / 0,24K / 8,00K
>
> OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> 308520 307456 99% 1,05K 20568 15 329088K ext4_inode_cache
> 1285388 1269730 98% 0,20K 67652 19 270608K vm_area_struct
> 61712 61692 99% 3,69K 7714 8 246848K task_struct
> 765472 649628 84% 0,25K 47842 16 191368K filp
> 36088 36083 99% 4,00K 4511 8 144352K kmalloc-4096
"ext4_inode_cache" is the usual top consumer of slab memory. I'm kind of
surprised that "dentry" did not make it to the top, but that can be
attributed to the filesystem usage pattern on that server.
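If you want to see how the slab total splits between reclaimable and
unreclaimable memory, /proc/meminfo carries the Slab, SReclaimable and
SUnreclaim counters. A minimal sketch follows; "slab_summary" is a name I
made up, and it only reformats those three lines from whatever
meminfo-style text it gets on stdin:

```shell
# slab_summary: hypothetical helper, not an existing tool.
# Reads /proc/meminfo-style text on stdin and prints the three slab
# counters converted from kB to whole MiB (fractions are truncated).
slab_summary() {
    LC_ALL=C awk '$1 == "Slab:" || $1 == "SReclaimable:" || $1 == "SUnreclaim:" {
        printf "%s %d MiB\n", $1, $2 / 1024
    }'
}

# Usage on a live system:
#   slab_summary < /proc/meminfo
```

Running it a few times over a day should show whether SUnreclaim is still
climbing or has levelled off like the slabtop counters did.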
> What I saw with "pidstat -rl 1 10" was systemd-journald and nmbd, so I did:
>
> sudo apt purge samba # Samba was not needed
Possible, but unlikely. Samba is popular; such abnormal memory
allocations would be widely known.
> sudo systemctl stop systemd-journald-dev-log.socket \
> systemd-journald-audit.socket systemd-journald.socket
It's redundant if you have rsyslog anyway, but again, it's popular
software; such things would be noticed. Unless, of course, you have
something that's spamming audit records, such as an AppArmor profile in
complain mode.
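A quick way to test the audit-spam theory is to count the records that
complain-mode profiles produce, which are tagged apparmor="ALLOWED". This
is only a sketch: "count_aa_allowed" is a made-up name, and I'm assuming
the audit records end up in the journal on your system (with auditd
running they would be in its own log instead):

```shell
# count_aa_allowed: hypothetical helper, not an existing tool.
# Counts the lines on stdin that contain apparmor="ALLOWED".
# "|| true" keeps the exit status at 0 when the count is zero,
# since grep itself returns 1 when nothing matches.
count_aa_allowed() {
    grep -c 'apparmor="ALLOWED"' || true
}

# Usage on a live system (needs read access to the journal):
#   journalctl --since "1 hour ago" | count_aa_allowed
#   sudo aa-status    # shows which profiles are in complain mode
```

A count in the thousands per hour would make journald a plausible suspect;
a handful would not.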
> and additionally I stopped the socket for the Check_MK agent:
>
> sudo systemctl stop check_mk.socket
I do not know this one. What's its purpose? Monitoring, backup,
something else? Is the source available?
On a side note, you seem to have enough used memory to consider turning
on transparent hugepages. It should help with the size of
"vm_area_struct" and "filp".
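To check the current transparent hugepage setting, read
/sys/kernel/mm/transparent_hugepage/enabled; the active value is the one
in square brackets. A small sketch, with "thp_mode" being a name I
invented for the occasion:

```shell
# thp_mode: hypothetical helper, not an existing tool.
# Reads the contents of /sys/kernel/mm/transparent_hugepage/enabled on
# stdin (e.g. "always [madvise] never") and prints the bracketed value.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Usage on a live system:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
# To switch it on until the next reboot (as root):
#   echo always > /sys/kernel/mm/transparent_hugepage/enabled
```

Try it at runtime first and watch slabtop for a while before making the
setting permanent.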
Reco