Bug#843015: Heavy swapping and OOM without visible cause
Package: linux-image-3.16.0-4-amd64
Version: 3.16.36-1+deb8u1
I am running Debian 8.6 (Linux 3.16.0-4-amd64 #1 SMP Debian
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux) under KVM or Xen on
several servers.
The VM has 4 CPUs and 4 GB of memory and runs a transparent Squid proxy
with netfilter redirection and a custom Squid URL rewriter.
From time to time the system comes under memory pressure for no
visible reason. Swap gets exhausted, and the OOM killer is
invoked.
Taking a "ps axu" snapshot and summing all the RSS values together with
file pages and slab pages does not come even close to the memory limit.
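The RSS part of that check can be sketched roughly as follows. This is only an illustration: the helper name sum_rss_kb is made up here, and it assumes the usual "ps axu" column order with RSS (in kB) in the sixth column. Since RSS counts shared pages once per process, the result is an upper bound on process memory.

```python
def sum_rss_kb(ps_output):
    """Sum the RSS column (kB) of `ps axu` output given as a string."""
    total = 0
    for line in ps_output.splitlines():
        fields = line.split()
        # Process rows have a numeric PID, VSZ and RSS (columns 2, 5, 6).
        # The header and any wrapped command lines do not, so they are skipped.
        if len(fields) >= 6 and all(fields[i].isdigit() for i in (1, 4, 5)):
            total += int(fields[5])
    return total
```

The input can be the text of a saved snapshot, so the same function works on a dump taken while the problem is happening.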
The sum of the numbers from /proc/zoneinfo (anon + file + slab + free)
does not match the total page count for either the Normal or the
DMA32 zone.
On another VM, which does not show the problem at the moment, the zone
numbers are fine and sum up nicely.
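A minimal sketch of this per-zone comparison, assuming the 3.16-era /proc/zoneinfo layout where the LRU and slab counters are reported per zone. The function names and the choice of counters in ACCOUNTED are my own reading of "anon + file + slab + free", not a kernel-defined formula.

```python
import re

# Counters treated as "accounted for" (an assumption for this sketch).
ACCOUNTED = (
    "nr_free_pages",
    "nr_inactive_anon", "nr_active_anon",
    "nr_inactive_file", "nr_active_file",
    "nr_slab_reclaimable", "nr_slab_unreclaimable",
)

def parse_zoneinfo(text):
    """Return {(node, zone_name): {counter: pages}} from /proc/zoneinfo text."""
    zones = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*Node\s+(\d+),\s+zone\s+(\S+)", line)
        if m:
            current = zones.setdefault((int(m.group(1)), m.group(2)), {})
            continue
        fields = line.split()
        # Keep plain "name value" rows; skip per-cpu pageset rows ("high: 186").
        if (current is not None and len(fields) == 2
                and fields[1].isdigit() and not fields[0].endswith(":")):
            current.setdefault(fields[0], int(fields[1]))
    return zones

def unaccounted_pages(zone):
    """Managed pages minus the counters listed in ACCOUNTED."""
    return zone["managed"] - sum(zone.get(name, 0) for name in ACCOUNTED)
```

Fed with the DMA32 figures from this report, unaccounted_pages gives 505094 of 763684 managed pages; for the Normal zone it gives 93953 of 247501.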
/proc/meminfo:
MemTotal: 4060648 kB
MemFree: 219024 kB
MemAvailable: 188900 kB
Buffers: 1216 kB
Cached: 75408 kB
SwapCached: 7096 kB
Active: 679800 kB
Inactive: 412424 kB
Active(anon): 649008 kB
Inactive(anon): 398848 kB
Active(file): 30792 kB
Inactive(file): 13576 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 1355772 kB
SwapFree: 511084 kB
Dirty: 116 kB
Writeback: 0 kB
AnonPages: 1008660 kB
Mapped: 22724 kB
Shmem: 32256 kB
Slab: 373140 kB
SReclaimable: 64320 kB
SUnreclaim: 308820 kB
KernelStack: 2320 kB
PageTables: 8820 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 3386096 kB
Committed_AS: 2194392 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 13252 kB
VmallocChunk: 34359651740 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 61308 kB
DirectMap2M: 4132864 kB
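As a rough cross-check of the figures above, the same kind of accounting can be done on /proc/meminfo. This is a sketch only: the helper names and the set of fields in ACCOUNTED_KB are assumptions of mine, and some categories overlap slightly (for example swap cache pages that are also mapped), so the result is approximate.

```python
# Fields treated as "accounted for" (an assumption for this sketch).
ACCOUNTED_KB = (
    "MemFree", "Buffers", "Cached", "SwapCached",
    "AnonPages", "Slab", "KernelStack", "PageTables",
)

def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo text (values mostly in kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

def unaccounted_kb(info):
    """MemTotal minus the fields listed in ACCOUNTED_KB."""
    return info["MemTotal"] - sum(info.get(name, 0) for name in ACCOUNTED_KB)
```

With the values in this report the result is 2364964 kB, i.e. more than half of MemTotal is not covered by any of these fields.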
/proc/zoneinfo:
Node 0, zone DMA
pages free 3972
min 66
low 82
high 99
scanned 0
spanned 4095
present 3998
managed 3977
nr_free_pages 3972
nr_alloc_batch 17
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 1
nr_slab_unreclaimable 0
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
numa_hit 107
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 107
numa_other 0
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 2980, 3947, 3947)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 4
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 4
cpu: 2
count: 0
high: 0
batch: 1
vm stats threshold: 4
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 4
all_unreclaimable: 1
start_pfn: 1
inactive_ratio: 1
Node 0, zone DMA32
pages free 44315
min 12710
low 15887
high 19065
scanned 0
spanned 1044480
present 782303
managed 763684
nr_free_pages 44315
nr_alloc_batch 3178
nr_inactive_anon 31650
nr_active_anon 126489
nr_inactive_file 60
nr_active_file 605
nr_unevictable 0
nr_mlock 0
nr_anon_pages 151875
nr_mapped 324
nr_file_pages 6936
nr_dirty 8
nr_writeback 0
nr_slab_reclaimable 10322
nr_slab_unreclaimable 45149
nr_page_table_pages 1198
nr_kernel_stack 80
nr_unstable 0
nr_bounce 0
nr_vmscan_write 325108
nr_vmscan_immediate_reclaim 143
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 5629
nr_dirtied 8576779
nr_written 8609273
numa_hit 7100373882
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 7100373882
numa_other 0
workingset_refault 131027
workingset_activate 21359
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 966, 966)
pagesets
cpu: 0
count: 6
high: 186
batch: 31
vm stats threshold: 36
cpu: 1
count: 0
high: 186
batch: 31
vm stats threshold: 36
cpu: 2
count: 2
high: 186
batch: 31
vm stats threshold: 36
cpu: 3
count: 5
high: 186
batch: 31
vm stats threshold: 36
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 4
Node 0, zone Normal
pages free 29512
min 4119
low 5148
high 6178
scanned 0
spanned 262144
present 262144
managed 247501
nr_free_pages 29512
nr_alloc_batch 1032
nr_inactive_anon 39602
nr_active_anon 38634
nr_inactive_file 1850
nr_active_file 5166
nr_unevictable 0
nr_mlock 0
nr_anon_pages 74337
nr_mapped 2908
nr_file_pages 10899
nr_dirty 58
nr_writeback 0
nr_slab_reclaimable 5739
nr_slab_unreclaimable 33045
nr_page_table_pages 1002
nr_kernel_stack 65
nr_unstable 0
nr_bounce 0
nr_vmscan_write 7135021
nr_vmscan_immediate_reclaim 3162844
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 3533
nr_dirtied 4402052
nr_written 7691982
numa_hit 2343477553
numa_miss 0
numa_foreign 0
numa_interleave 4615
numa_local 2343477553
numa_other 0
workingset_refault 1696598
workingset_activate 439794
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 77
high: 186
batch: 31
vm stats threshold: 24
cpu: 1
count: 187
high: 186
batch: 31
vm stats threshold: 24
cpu: 2
count: 162
high: 186
batch: 31
vm stats threshold: 24
cpu: 3
count: 162
high: 186
batch: 31
vm stats threshold: 24
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 1
/proc/buddyinfo:
Node 0, zone DMA        2     1    2  1 3 2 2 1 2 2 2
Node 0, zone DMA32  12437 12292 1009  2 2 0 0 0 0 1 0
Node 0, zone Normal 20329  5337    0 10 3 0 0 1 0 0 0
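For reference, each buddyinfo column is the number of free blocks of order 0, 1, 2, ..., so the free page count of a zone is the sum of count * 2^order. A small sketch of that calculation (the function name is made up, and it assumes each zone's counts sit on one line, as in the kernel's own output):

```python
import re

def buddy_free_pages(text):
    """Return {(node, zone_name): free pages} from /proc/buddyinfo text."""
    free = {}
    for line in text.splitlines():
        m = re.match(r"\s*Node\s+(\d+),\s+zone\s+(\S+)\s+(.*)", line)
        if not m:
            continue
        counts = [int(c) for c in m.group(3).split()]
        # Column i holds the number of free blocks of 2**i pages.
        pages = sum(n << order for order, n in enumerate(counts))
        free[(int(m.group(1)), m.group(2))] = pages
    return free
```

For the DMA zone this gives 3972 pages, which agrees with nr_free_pages for that zone in /proc/zoneinfo.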
ps axu:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 28880 1676 ? Ss Oct06 1:36 /sbin/init
root 2 0.0 0.0 0 0 ? S Oct06 0:10 [kthreadd]
root 3 0.7 0.0 0 0 ? S Oct06 312:28 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< Oct06 0:00 [kworker/0:0H]
root 7 0.2 0.0 0 0 ? S Oct06 100:15 [rcu_sched]
root 8 0.0 0.0 0 0 ? S Oct06 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? S Oct06 0:06 [migration/0]
root 10 0.0 0.0 0 0 ? S Oct06 0:41 [watchdog/0]
root 11 0.0 0.0 0 0 ? S Oct06 1:42 [watchdog/1]
root 12 0.0 0.0 0 0 ? S Oct06 0:17 [migration/1]
root 13 1.7 0.0 0 0 ? S Oct06 710:32 [ksoftirqd/1]
root 15 0.0 0.0 0 0 ? S< Oct06 0:00 [kworker/1:0H]
root 16 0.0 0.0 0 0 ? S Oct06 2:06 [watchdog/2]
root 17 0.0 0.0 0 0 ? S Oct06 0:17 [migration/2]
root 18 1.7 0.0 0 0 ? S Oct06 711:36 [ksoftirqd/2]
root 20 0.0 0.0 0 0 ? S< Oct06 0:00 [kworker/2:0H]
root 21 0.0 0.0 0 0 ? S Oct06 2:15 [watchdog/3]
root 22 0.0 0.0 0 0 ? S Oct06 0:23 [migration/3]
root 23 1.8 0.0 0 0 ? S Oct06 754:45 [ksoftirqd/3]
root 25 0.0 0.0 0 0 ? S< Oct06 0:00 [kworker/3:0H]
root 26 0.0 0.0 0 0 ? S< Oct06 0:00 [khelper]
root 27 0.0 0.0 0 0 ? S Oct06 0:00 [kdevtmpfs]
root 28 0.0 0.0 0 0 ? S< Oct06 0:00 [netns]
root 29 0.0 0.0 0 0 ? S Oct06 0:04 [khungtaskd]
root 30 0.0 0.0 0 0 ? S< Oct06 0:00 [writeback]
root 31 0.0 0.0 0 0 ? SN Oct06 0:00 [ksmd]
root 32 0.0 0.0 0 0 ? SN Oct06 1:13 [khugepaged]
root 33 0.0 0.0 0 0 ? S< Oct06 0:00 [crypto]
root 34 0.0 0.0 0 0 ? S< Oct06 0:00 [kintegrityd]
root 35 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 36 0.0 0.0 0 0 ? S< Oct06 0:00 [kblockd]
root 39 0.0 0.0 0 0 ? S Oct06 14:31 [kswapd0]
root 40 0.0 0.0 0 0 ? S< Oct06 0:00 [vmstat]
root 41 0.0 0.0 0 0 ? S Oct06 0:00 [fsnotify_mark]
root 47 0.0 0.0 0 0 ? S< Oct06 0:00 [kthrotld]
root 48 0.0 0.0 0 0 ? S< Oct06 0:00 [ipv6_addrconf]
root 49 0.0 0.0 0 0 ? S< Oct06 0:00 [deferwq]
root 90 0.0 0.0 0 0 ? S Oct06 0:00 [khubd]
root 91 0.0 0.0 0 0 ? S< Oct06 0:00 [ata_sff]
root 92 0.0 0.0 0 0 ? S< Oct06 0:00 [kpsmoused]
root 96 0.0 0.0 0 0 ? S Oct06 0:00 [scsi_eh_0]
root 97 0.0 0.0 0 0 ? S< Oct06 0:00 [scsi_tmf_0]
root 98 0.0 0.0 0 0 ? S Oct06 0:00 [scsi_eh_1]
root 99 0.0 0.0 0 0 ? S< Oct06 0:00 [scsi_tmf_1]
root 105 0.0 0.0 0 0 ? S< Oct06 0:09 [kworker/0:1H]
root 113 0.0 0.0 0 0 ? S< Oct06 0:00 [kdmflush]
root 114 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 120 0.0 0.0 0 0 ? S< Oct06 0:00 [kdmflush]
root 121 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 140 0.0 0.0 0 0 ? S< Oct06 0:10 [kworker/2:1H]
root 143 0.0 0.0 0 0 ? S Oct06 0:30 [jbd2/dm-0-8]
root 144 0.0 0.0 0 0 ? S< Oct06 0:00 [ext4-rsv-conver]
root 174 0.0 0.0 0 0 ? S Oct06 0:00 [kauditd]
root 187 0.0 0.0 41076 8 ? Ss Oct06 0:00 /lib/systemd/systemd-udevd
root 189 0.5 0.1 41532 4184 ? Ss Oct06 216:16 /lib/systemd/systemd-journald
root 219 0.0 0.0 0 0 ? S Oct06 17:56 [vballoon]
root 267 0.0 0.0 0 0 ? S< Oct06 0:00 [hd-audio0]
root 282 0.0 0.0 0 0 ? S< Oct06 0:00 [kdmflush]
root 283 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 284 0.0 0.0 0 0 ? S< Oct06 0:00 [kdmflush]
root 286 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 287 0.0 0.0 0 0 ? S< Oct06 0:00 [kdmflush]
root 288 0.0 0.0 0 0 ? S< Oct06 0:00 [bioset]
root 289 0.0 0.0 0 0 ? S< Oct06 0:00 [ext4-rsv-conver]
root 316 0.0 0.0 0 0 ? S Oct06 2:08 [jbd2/dm-2-8]
root 317 0.0 0.0 0 0 ? S< Oct06 0:00 [ext4-rsv-conver]
root 321 0.0 0.0 0 0 ? S< Oct06 0:09 [kworker/1:1H]
root 325 0.0 0.0 0 0 ? S Oct06 0:00 [jbd2/dm-4-8]
root 326 0.0 0.0 0 0 ? S< Oct06 0:00 [ext4-rsv-conver]
root 329 0.0 0.0 0 0 ? S Oct06 0:16 [jbd2/dm-3-8]
root 330 0.0 0.0 0 0 ? S< Oct06 0:00 [ext4-rsv-conver]
root 1062 0.0 0.0 0 0 ? S< Oct06 0:10 [kworker/3:1H]
root 1415 0.0 0.0 37080 44 ? Ss Oct06 0:29 /sbin/rpcbind -w
statd 1424 0.0 0.0 37280 12 ? Ss Oct06 0:00 /sbin/rpc.statd
root 1429 0.0 0.0 0 0 ? S< Oct06 0:00 [rpciod]
root 1432 0.0 0.0 0 0 ? S< Oct06 0:00 [nfsiod]
root 1439 0.0 0.0 23356 0 ? Ss Oct06 0:00 /usr/sbin/rpc.idmapd
vnstat 1440 0.0 0.0 7360 660 ? Ss Oct06 3:52 /usr/sbin/vnstatd -n
daemon 1445 0.0 0.0 19024 0 ? Ss Oct06 0:00 /usr/sbin/atd -f
root 1453 0.0 0.0 55184 8 ? Ss Oct06 0:00 /usr/sbin/sshd -D
root 1458 0.0 0.0 27504 588 ? Ss Oct06 1:18 /usr/sbin/cron -f
root 1460 0.0 0.0 19856 44 ? Ss Oct06 0:27 /lib/systemd/systemd-logind
message+ 1491 0.0 0.0 42124 0 ? Ss Oct06 0:00 /usr/bin/dbus-daemon --system --address=systemd: --nofork
ntp 1513 0.0 0.0 33384 1024 ? Ss Oct06 17:14 /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 108:113
root 1517 0.0 0.0 19380 660 ? Ss Oct06 8:37 /usr/sbin/irqbalance --pid=/var/run/irqbalance.pid
root 1528 0.1 0.0 258672 968 ? Ssl Oct06 69:47 /usr/sbin/rsyslogd -n
root 1529 0.0 0.0 4256 12 ? Ss Oct06 0:00 /usr/sbin/acpid
root 1567 0.0 0.0 35300 96 ? S Oct06 6:33 /usr/bin/perl /opt/blacklist/bin/sniwrapper.pl
root 1576 0.0 0.0 14416 12 tty1 Ss+ Oct06 0:00 /sbin/agetty --noclear tty1 linux
Debian-+ 1821 0.0 0.0 53252 412 ? Ss Oct06 0:02 /usr/sbin/exim4 -bd -q30m
root 6589 0.0 0.0 79616 48 ? Ss Oct23 0:00 /usr/sbin/squid3 -YC -f /etc/squid3/squid.conf
proxy 6591 0.0 0.0 156292 3444 ? S Oct23 2:44 (squid-coord-5) -YC -f /etc/squid3/squid.conf
proxy 6592 18.0 6.5 685796 267424 ? R Oct23 2799:47 (squid-4) -YC -f /etc/squid3/squid.conf
proxy 6598 19.3 6.9 711076 283584 ? R Oct23 3008:19 (squid-1) -YC -f /etc/squid3/squid.conf
root 9511 0.0 0.0 0 0 ? S Oct31 0:00 [kworker/1:0]
bind 9513 12.9 2.0 604016 84892 ? Ssl Oct31 469:07 /usr/sbin/named -f -u bind -S 65535
root 9520 0.0 0.0 0 0 ? S Oct31 0:02 [kworker/3:2]
proxy 21634 0.0 0.0 49764 2756 ? S 06:25 0:01 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21639 8.9 0.1 51052 4344 ? S 06:25 16:00 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21647 8.7 0.1 50952 4324 ? S 06:25 15:35 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21656 2.3 0.1 50976 4212 ? S 06:25 4:09 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21663 2.2 0.1 50976 4172 ? S 06:25 4:03 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21667 0.7 0.1 50968 4368 ? S 06:25 1:15 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 21682 0.6 0.1 50976 4132 ? S 06:25 1:10 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root 21686 0.0 0.0 64824 0 ? S 06:25 0:00 /usr/sbin/zabbix_agentd
root 21688 0.0 0.0 64824 932 ? S 06:25 0:03 /usr/sbin/zabbix_agentd: collector [idle 1 sec]
root 21689 0.0 0.0 64824 1076 ? S 06:25 0:06 /usr/sbin/zabbix_agentd: listener #1 [waiting for connect
root 21690 0.0 0.0 64824 1088 ? S 06:25 0:03 /usr/sbin/zabbix_agentd: listener #2 [waiting for connect
root 21691 0.0 0.0 64824 1132 ? S 06:25 0:03 /usr/sbin/zabbix_agentd: listener #3 [waiting for connect
root 21693 0.0 0.0 64824 24 ? S 06:25 0:00 /usr/sbin/zabbix_agentd: active checks #1 [idle 1 sec]
bird 22608 0.1 0.1 23632 5680 ? Ss Oct13 40:06 /usr/sbin/bird -f -u bird -g bird
bird 22693 0.0 0.0 18464 3184 ? Ss Oct13 12:09 /usr/sbin/bird6 -f -u bird -g bird
root 25507 0.0 0.0 0 0 ? S 07:57 0:00 [kworker/u8:0]
root 25646 0.0 0.0 0 0 ? S Nov01 1:28 [kworker/0:2]
proxy 26454 27.7 2.6 310708 108432 ? R 08:21 17:23 (squid-2) -YC -f /etc/squid3/squid.conf
proxy 26455 7.6 0.1 50972 4432 ? S 08:21 4:46 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 26456 2.6 0.1 50980 4700 ? S 08:21 1:41 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 26457 1.3 0.1 50968 4384 ? S 08:21 0:49 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root 26896 0.0 0.0 0 0 ? S 08:31 0:03 [kworker/3:0]
root 27321 0.0 0.0 82592 60 ? Ss 08:42 0:00 sshd: uuu [priv]
root 27342 0.1 0.0 0 0 ? S 08:42 0:03 [kworker/0:1]
uuu 27343 0.0 0.0 82732 644 ? S 08:42 0:00 sshd: uuu@pts/0
uuu 27352 0.2 0.0 24324 3580 pts/0 Ss 08:42 0:07 -bash
root 27807 0.1 0.0 0 0 ? S 08:53 0:02 [kworker/1:2]
proxy 28014 40.7 2.7 268496 113108 ? R 08:58 10:19 (squid-3) -YC -f /etc/squid3/squid.conf
proxy 28015 11.5 0.2 50964 8148 ? R 08:58 2:54 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 28016 2.8 0.2 50968 8128 ? S 08:58 0:42 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy 28017 0.6 0.2 50832 8172 ? S 08:58 0:09 /usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root 28532 0.0 0.0 0 0 ? S 09:10 0:00 [kworker/u8:2]
root 28584 0.2 0.0 0 0 ? S 09:11 0:01 [kworker/2:1]
root 28783 0.0 0.0 0 0 ? S 09:16 0:00 [kworker/2:0]
root 28980 0.0 0.0 0 0 ? S 09:21 0:00 [kworker/u8:1]
root 28984 0.0 0.0 0 0 ? S 09:21 0:00 [kworker/2:2]
uuu 29080 0.0 0.0 19100 2328 pts/0 R+ 09:23 0:00 ps axu
free:
             total       used       free     shared    buffers     cached
Mem:       4060648    3695996     364652       2588        968      23380
-/+ buffers/cache:    3671648     389000
Swap:      1355772    1277624      78148
Best regards