
Bug#843015: Heavy swapping and OOM without visible cause



Package: linux-image-3.16.0-4-amd64
Version: 3.16.36-1+deb8u1

I am running Debian 8.6 (Linux 3.16.0-4-amd64 #1 SMP Debian
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux) under KVM or Xen on
several servers.
The affected VM has 4 CPUs and 4 GB of memory and runs Squid as a
transparent proxy with netfilter redirection and a custom Squid URL
rewriter.

From time to time the system comes under memory pressure for no
visible reason. Swap gets exhausted, and the OOM killer is invoked.
Taking a "ps axu" snapshot and summing all the RSS values together
with file pages and slab pages does not come even close to the memory
limit.
The sum of the counters from /proc/zoneinfo (anon + file + slab + free)
does not match the total page count for either the Normal or the DMA32
zone.

On another VM, which does not show the problem at the moment, the zone
numbers are fine and add up nicely.
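
The per-zone accounting described above can be sketched roughly as follows. This is a minimal illustration, not the script used for this report: the function names are hypothetical, the choice of counters standing for "anon + file + slab + free" is an assumption, and the parser assumes the /proc/zoneinfo layout of this 3.16 kernel, where the nr_* counters are listed per zone.

```python
import re

# Counters summed per zone. Which counters make up "anon + file + slab + free"
# is an assumption; the report does not list them explicitly.
ACCOUNTED = (
    "nr_active_anon", "nr_inactive_anon",
    "nr_active_file", "nr_inactive_file",
    "nr_slab_reclaimable", "nr_slab_unreclaimable",
    "nr_free_pages",
)

ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")


def parse_zoneinfo(text):
    """Return {(node, zone_name): {counter: value}} from /proc/zoneinfo text."""
    zones = {}
    current = None
    for line in text.splitlines():
        header = ZONE_HEADER.match(line)
        if header:
            key = (int(header.group(1)), header.group(2))
            current = zones.setdefault(key, {})
            continue
        fields = line.split()
        # Keep plain "name value" pairs; skip per-CPU pageset lines
        # such as "count: 6", whose key ends in a colon.
        if (current is not None and len(fields) == 2
                and not fields[0].endswith(":") and fields[1].isdigit()):
            current[fields[0]] = int(fields[1])
    return zones


def unaccounted_pages(counters):
    """Managed pages minus the pages covered by the ACCOUNTED counters."""
    return counters["managed"] - sum(counters.get(k, 0) for k in ACCOUNTED)


if __name__ == "__main__":
    with open("/proc/zoneinfo") as f:
        for (node, name), counters in parse_zoneinfo(f.read()).items():
            print(node, name, counters["managed"], unaccounted_pages(counters))
```

Fed the DMA32 counters from the /proc/zoneinfo dump in this report, this leaves 505094 of 763684 managed pages unaccounted for, roughly 1.9 GiB at 4 KiB per page.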


/proc/meminfo:
MemTotal:        4060648 kB
MemFree:          219024 kB
MemAvailable:     188900 kB
Buffers:            1216 kB
Cached:            75408 kB
SwapCached:         7096 kB
Active:           679800 kB
Inactive:         412424 kB
Active(anon):     649008 kB
Inactive(anon):   398848 kB
Active(file):      30792 kB
Inactive(file):    13576 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       1355772 kB
SwapFree:         511084 kB
Dirty:               116 kB
Writeback:             0 kB
AnonPages:       1008660 kB
Mapped:            22724 kB
Shmem:             32256 kB
Slab:             373140 kB
SReclaimable:      64320 kB
SUnreclaim:       308820 kB
KernelStack:        2320 kB
PageTables:         8820 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     3386096 kB
Committed_AS:    2194392 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       13252 kB
VmallocChunk:   34359651740 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       61308 kB
DirectMap2M:     4132864 kB


/proc/zoneinfo:
Node 0, zone      DMA
  pages free     3972
        min      66
        low      82
        high     99
        scanned  0
        spanned  4095
        present  3998
        managed  3977
    nr_free_pages 3972
    nr_alloc_batch 17
    nr_inactive_anon 0
    nr_active_anon 0
    nr_inactive_file 0
    nr_active_file 0
    nr_unevictable 0
    nr_mlock     0
    nr_anon_pages 0
    nr_mapped    0
    nr_file_pages 0
    nr_dirty     0
    nr_writeback 0
    nr_slab_reclaimable 1
    nr_slab_unreclaimable 0
    nr_page_table_pages 0
    nr_kernel_stack 0
    nr_unstable  0
    nr_bounce    0
    nr_vmscan_write 0
    nr_vmscan_immediate_reclaim 0
    nr_writeback_temp 0
    nr_isolated_anon 0
    nr_isolated_file 0
    nr_shmem     0
    nr_dirtied   0
    nr_written   0
    numa_hit     107
    numa_miss    0
    numa_foreign 0
    numa_interleave 0
    numa_local   107
    numa_other   0
    workingset_refault 0
    workingset_activate 0
    workingset_nodereclaim 0
    nr_anon_transparent_hugepages 0
    nr_free_cma  0
        protection: (0, 2980, 3947, 3947)
  pagesets
    cpu: 0
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 4
    cpu: 1
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 4
    cpu: 2
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 4
    cpu: 3
              count: 0
              high:  0
              batch: 1
  vm stats threshold: 4
  all_unreclaimable: 1
  start_pfn:         1
  inactive_ratio:    1
Node 0, zone    DMA32
  pages free     44315
        min      12710
        low      15887
        high     19065
        scanned  0
        spanned  1044480
        present  782303
        managed  763684
    nr_free_pages 44315
    nr_alloc_batch 3178
    nr_inactive_anon 31650
    nr_active_anon 126489
    nr_inactive_file 60
    nr_active_file 605
    nr_unevictable 0
    nr_mlock     0
    nr_anon_pages 151875
    nr_mapped    324
    nr_file_pages 6936
    nr_dirty     8
    nr_writeback 0
    nr_slab_reclaimable 10322
    nr_slab_unreclaimable 45149
    nr_page_table_pages 1198
    nr_kernel_stack 80
    nr_unstable  0
    nr_bounce    0
    nr_vmscan_write 325108
    nr_vmscan_immediate_reclaim 143
    nr_writeback_temp 0
    nr_isolated_anon 0
    nr_isolated_file 0
    nr_shmem     5629
    nr_dirtied   8576779
    nr_written   8609273
    numa_hit     7100373882
    numa_miss    0
    numa_foreign 0
    numa_interleave 0
    numa_local   7100373882
    numa_other   0
    workingset_refault 131027
    workingset_activate 21359
    workingset_nodereclaim 0
    nr_anon_transparent_hugepages 0
    nr_free_cma  0
        protection: (0, 0, 966, 966)
  pagesets
    cpu: 0
              count: 6
              high:  186
              batch: 31
  vm stats threshold: 36
    cpu: 1
              count: 0
              high:  186
              batch: 31
  vm stats threshold: 36
    cpu: 2
              count: 2
              high:  186
              batch: 31
  vm stats threshold: 36
    cpu: 3
              count: 5
              high:  186
              batch: 31
  vm stats threshold: 36
  all_unreclaimable: 0
  start_pfn:         4096
  inactive_ratio:    4
Node 0, zone   Normal
  pages free     29512
        min      4119
        low      5148
        high     6178
        scanned  0
        spanned  262144
        present  262144
        managed  247501
    nr_free_pages 29512
    nr_alloc_batch 1032
    nr_inactive_anon 39602
    nr_active_anon 38634
    nr_inactive_file 1850
    nr_active_file 5166
    nr_unevictable 0
    nr_mlock     0
    nr_anon_pages 74337
    nr_mapped    2908
    nr_file_pages 10899
    nr_dirty     58
    nr_writeback 0
    nr_slab_reclaimable 5739
    nr_slab_unreclaimable 33045
    nr_page_table_pages 1002
    nr_kernel_stack 65
    nr_unstable  0
    nr_bounce    0
    nr_vmscan_write 7135021
    nr_vmscan_immediate_reclaim 3162844
    nr_writeback_temp 0
    nr_isolated_anon 0
    nr_isolated_file 0
    nr_shmem     3533
    nr_dirtied   4402052
    nr_written   7691982
    numa_hit     2343477553
    numa_miss    0
    numa_foreign 0
    numa_interleave 4615
    numa_local   2343477553
    numa_other   0
    workingset_refault 1696598
    workingset_activate 439794
    workingset_nodereclaim 0
    nr_anon_transparent_hugepages 0
    nr_free_cma  0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 77
              high:  186
              batch: 31
  vm stats threshold: 24
    cpu: 1
              count: 187
              high:  186
              batch: 31
  vm stats threshold: 24
    cpu: 2
              count: 162
              high:  186
              batch: 31
  vm stats threshold: 24
    cpu: 3
              count: 162
              high:  186
              batch: 31
  vm stats threshold: 24
  all_unreclaimable: 0
  start_pfn:         1048576
  inactive_ratio:    1


/proc/buddyinfo:
Node 0, zone      DMA      2      1      2      1      3      2      2      1      2      2      2
Node 0, zone    DMA32  12437  12292   1009      2      2      0      0      0      0      1      0
Node 0, zone   Normal  20329   5337      0     10      3      0      0      1      0      0      0


ps axu:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  28880  1676 ?        Ss   Oct06   1:36 /sbin/init
root         2  0.0  0.0      0     0 ?        S    Oct06   0:10 [kthreadd]
root         3  0.7  0.0      0     0 ?        S    Oct06 312:28 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kworker/0:0H]
root         7  0.2  0.0      0     0 ?        S    Oct06 100:15 [rcu_sched]
root         8  0.0  0.0      0     0 ?        S    Oct06   0:00 [rcu_bh]
root         9  0.0  0.0      0     0 ?        S    Oct06   0:06 [migration/0]
root        10  0.0  0.0      0     0 ?        S    Oct06   0:41 [watchdog/0]
root        11  0.0  0.0      0     0 ?        S    Oct06   1:42 [watchdog/1]
root        12  0.0  0.0      0     0 ?        S    Oct06   0:17 [migration/1]
root        13  1.7  0.0      0     0 ?        S    Oct06 710:32 [ksoftirqd/1]
root        15  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kworker/1:0H]
root        16  0.0  0.0      0     0 ?        S    Oct06   2:06 [watchdog/2]
root        17  0.0  0.0      0     0 ?        S    Oct06   0:17 [migration/2]
root        18  1.7  0.0      0     0 ?        S    Oct06 711:36 [ksoftirqd/2]
root        20  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kworker/2:0H]
root        21  0.0  0.0      0     0 ?        S    Oct06   2:15 [watchdog/3]
root        22  0.0  0.0      0     0 ?        S    Oct06   0:23 [migration/3]
root        23  1.8  0.0      0     0 ?        S    Oct06 754:45 [ksoftirqd/3]
root        25  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kworker/3:0H]
root        26  0.0  0.0      0     0 ?        S<   Oct06   0:00 [khelper]
root        27  0.0  0.0      0     0 ?        S    Oct06   0:00 [kdevtmpfs]
root        28  0.0  0.0      0     0 ?        S<   Oct06   0:00 [netns]
root        29  0.0  0.0      0     0 ?        S    Oct06   0:04 [khungtaskd]
root        30  0.0  0.0      0     0 ?        S<   Oct06   0:00 [writeback]
root        31  0.0  0.0      0     0 ?        SN   Oct06   0:00 [ksmd]
root        32  0.0  0.0      0     0 ?        SN   Oct06   1:13 [khugepaged]
root        33  0.0  0.0      0     0 ?        S<   Oct06   0:00 [crypto]
root        34  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kintegrityd]
root        35  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root        36  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kblockd]
root        39  0.0  0.0      0     0 ?        S    Oct06  14:31 [kswapd0]
root        40  0.0  0.0      0     0 ?        S<   Oct06   0:00 [vmstat]
root        41  0.0  0.0      0     0 ?        S    Oct06   0:00 [fsnotify_mark]
root        47  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kthrotld]
root        48  0.0  0.0      0     0 ?        S<   Oct06   0:00 [ipv6_addrconf]
root        49  0.0  0.0      0     0 ?        S<   Oct06   0:00 [deferwq]
root        90  0.0  0.0      0     0 ?        S    Oct06   0:00 [khubd]
root        91  0.0  0.0      0     0 ?        S<   Oct06   0:00 [ata_sff]
root        92  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kpsmoused]
root        96  0.0  0.0      0     0 ?        S    Oct06   0:00 [scsi_eh_0]
root        97  0.0  0.0      0     0 ?        S<   Oct06   0:00 [scsi_tmf_0]
root        98  0.0  0.0      0     0 ?        S    Oct06   0:00 [scsi_eh_1]
root        99  0.0  0.0      0     0 ?        S<   Oct06   0:00 [scsi_tmf_1]
root       105  0.0  0.0      0     0 ?        S<   Oct06   0:09 [kworker/0:1H]
root       113  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kdmflush]
root       114  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root       120  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kdmflush]
root       121  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root       140  0.0  0.0      0     0 ?        S<   Oct06   0:10 [kworker/2:1H]
root       143  0.0  0.0      0     0 ?        S    Oct06   0:30 [jbd2/dm-0-8]
root       144  0.0  0.0      0     0 ?        S<   Oct06   0:00
[ext4-rsv-conver]
root       174  0.0  0.0      0     0 ?        S    Oct06   0:00 [kauditd]
root       187  0.0  0.0  41076     8 ?        Ss   Oct06   0:00
/lib/systemd/systemd-udevd
root       189  0.5  0.1  41532  4184 ?        Ss   Oct06 216:16
/lib/systemd/systemd-journald
root       219  0.0  0.0      0     0 ?        S    Oct06  17:56 [vballoon]
root       267  0.0  0.0      0     0 ?        S<   Oct06   0:00 [hd-audio0]
root       282  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kdmflush]
root       283  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root       284  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kdmflush]
root       286  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root       287  0.0  0.0      0     0 ?        S<   Oct06   0:00 [kdmflush]
root       288  0.0  0.0      0     0 ?        S<   Oct06   0:00 [bioset]
root       289  0.0  0.0      0     0 ?        S<   Oct06   0:00
[ext4-rsv-conver]
root       316  0.0  0.0      0     0 ?        S    Oct06   2:08 [jbd2/dm-2-8]
root       317  0.0  0.0      0     0 ?        S<   Oct06   0:00
[ext4-rsv-conver]
root       321  0.0  0.0      0     0 ?        S<   Oct06   0:09 [kworker/1:1H]
root       325  0.0  0.0      0     0 ?        S    Oct06   0:00 [jbd2/dm-4-8]
root       326  0.0  0.0      0     0 ?        S<   Oct06   0:00
[ext4-rsv-conver]
root       329  0.0  0.0      0     0 ?        S    Oct06   0:16 [jbd2/dm-3-8]
root       330  0.0  0.0      0     0 ?        S<   Oct06   0:00
[ext4-rsv-conver]
root      1062  0.0  0.0      0     0 ?        S<   Oct06   0:10 [kworker/3:1H]
root      1415  0.0  0.0  37080    44 ?        Ss   Oct06   0:29
/sbin/rpcbind -w
statd     1424  0.0  0.0  37280    12 ?        Ss   Oct06   0:00 /sbin/rpc.statd
root      1429  0.0  0.0      0     0 ?        S<   Oct06   0:00 [rpciod]
root      1432  0.0  0.0      0     0 ?        S<   Oct06   0:00 [nfsiod]
root      1439  0.0  0.0  23356     0 ?        Ss   Oct06   0:00
/usr/sbin/rpc.idmapd
vnstat    1440  0.0  0.0   7360   660 ?        Ss   Oct06   3:52
/usr/sbin/vnstatd -n
daemon    1445  0.0  0.0  19024     0 ?        Ss   Oct06   0:00
/usr/sbin/atd -f
root      1453  0.0  0.0  55184     8 ?        Ss   Oct06   0:00
/usr/sbin/sshd -D
root      1458  0.0  0.0  27504   588 ?        Ss   Oct06   1:18
/usr/sbin/cron -f
root      1460  0.0  0.0  19856    44 ?        Ss   Oct06   0:27
/lib/systemd/systemd-logind
message+  1491  0.0  0.0  42124     0 ?        Ss   Oct06   0:00
/usr/bin/dbus-daemon --system --address=systemd: --nofork
ntp       1513  0.0  0.0  33384  1024 ?        Ss   Oct06  17:14
/usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 108:113
root      1517  0.0  0.0  19380   660 ?        Ss   Oct06   8:37
/usr/sbin/irqbalance --pid=/var/run/irqbalance.pid
root      1528  0.1  0.0 258672   968 ?        Ssl  Oct06  69:47
/usr/sbin/rsyslogd -n
root      1529  0.0  0.0   4256    12 ?        Ss   Oct06   0:00 /usr/sbin/acpid
root      1567  0.0  0.0  35300    96 ?        S    Oct06   6:33
/usr/bin/perl /opt/blacklist/bin/sniwrapper.pl
root      1576  0.0  0.0  14416    12 tty1     Ss+  Oct06   0:00
/sbin/agetty --noclear tty1 linux
Debian-+  1821  0.0  0.0  53252   412 ?        Ss   Oct06   0:02
/usr/sbin/exim4 -bd -q30m
root      6589  0.0  0.0  79616    48 ?        Ss   Oct23   0:00
/usr/sbin/squid3 -YC -f /etc/squid3/squid.conf
proxy     6591  0.0  0.0 156292  3444 ?        S    Oct23   2:44
(squid-coord-5) -YC -f /etc/squid3/squid.conf
proxy     6592 18.0  6.5 685796 267424 ?       R    Oct23 2799:47
(squid-4) -YC -f /etc/squid3/squid.conf
proxy     6598 19.3  6.9 711076 283584 ?       R    Oct23 3008:19
(squid-1) -YC -f /etc/squid3/squid.conf
root      9511  0.0  0.0      0     0 ?        S    Oct31   0:00 [kworker/1:0]
bind      9513 12.9  2.0 604016 84892 ?        Ssl  Oct31 469:07
/usr/sbin/named -f -u bind -S 65535
root      9520  0.0  0.0      0     0 ?        S    Oct31   0:02 [kworker/3:2]
proxy    21634  0.0  0.0  49764  2756 ?        S    06:25   0:01
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21639  8.9  0.1  51052  4344 ?        S    06:25  16:00
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21647  8.7  0.1  50952  4324 ?        S    06:25  15:35
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21656  2.3  0.1  50976  4212 ?        S    06:25   4:09
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21663  2.2  0.1  50976  4172 ?        S    06:25   4:03
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21667  0.7  0.1  50968  4368 ?        S    06:25   1:15
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    21682  0.6  0.1  50976  4132 ?        S    06:25   1:10
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root     21686  0.0  0.0  64824     0 ?        S    06:25   0:00
/usr/sbin/zabbix_agentd
root     21688  0.0  0.0  64824   932 ?        S    06:25   0:03
/usr/sbin/zabbix_agentd: collector [idle 1 sec]
root     21689  0.0  0.0  64824  1076 ?        S    06:25   0:06
/usr/sbin/zabbix_agentd: listener #1 [waiting for connect
root     21690  0.0  0.0  64824  1088 ?        S    06:25   0:03
/usr/sbin/zabbix_agentd: listener #2 [waiting for connect
root     21691  0.0  0.0  64824  1132 ?        S    06:25   0:03
/usr/sbin/zabbix_agentd: listener #3 [waiting for connect
root     21693  0.0  0.0  64824    24 ?        S    06:25   0:00
/usr/sbin/zabbix_agentd: active checks #1 [idle 1 sec]
bird     22608  0.1  0.1  23632  5680 ?        Ss   Oct13  40:06
/usr/sbin/bird -f -u bird -g bird
bird     22693  0.0  0.0  18464  3184 ?        Ss   Oct13  12:09
/usr/sbin/bird6 -f -u bird -g bird
root     25507  0.0  0.0      0     0 ?        S    07:57   0:00 [kworker/u8:0]
root     25646  0.0  0.0      0     0 ?        S    Nov01   1:28 [kworker/0:2]
proxy    26454 27.7  2.6 310708 108432 ?       R    08:21  17:23
(squid-2) -YC -f /etc/squid3/squid.conf
proxy    26455  7.6  0.1  50972  4432 ?        S    08:21   4:46
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    26456  2.6  0.1  50980  4700 ?        S    08:21   1:41
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    26457  1.3  0.1  50968  4384 ?        S    08:21   0:49
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root     26896  0.0  0.0      0     0 ?        S    08:31   0:03 [kworker/3:0]
root     27321  0.0  0.0  82592    60 ?        Ss   08:42   0:00 sshd:
uuu [priv]
root     27342  0.1  0.0      0     0 ?        S    08:42   0:03 [kworker/0:1]
uuu   27343  0.0  0.0  82732   644 ?        S    08:42   0:00 sshd: uuu@pts/0
uuu   27352  0.2  0.0  24324  3580 pts/0    Ss   08:42   0:07 -bash
root     27807  0.1  0.0      0     0 ?        S    08:53   0:02 [kworker/1:2]
proxy    28014 40.7  2.7 268496 113108 ?       R    08:58  10:19
(squid-3) -YC -f /etc/squid3/squid.conf
proxy    28015 11.5  0.2  50964  8148 ?        R    08:58   2:54
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    28016  2.8  0.2  50968  8128 ?        S    08:58   0:42
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
proxy    28017  0.6  0.2  50832  8172 ?        S    08:58   0:09
/usr/bin/perl -w /opt/blacklist/bin/url_rewrite.pl
root     28532  0.0  0.0      0     0 ?        S    09:10   0:00 [kworker/u8:2]
root     28584  0.2  0.0      0     0 ?        S    09:11   0:01 [kworker/2:1]
root     28783  0.0  0.0      0     0 ?        S    09:16   0:00 [kworker/2:0]
root     28980  0.0  0.0      0     0 ?        S    09:21   0:00 [kworker/u8:1]
root     28984  0.0  0.0      0     0 ?        S    09:21   0:00 [kworker/2:2]
uuu   29080  0.0  0.0  19100  2328 pts/0    R+   09:23   0:00 ps axu
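
The RSS summation mentioned at the top of this report could be reproduced along these lines. This is only a sketch with hypothetical helper names: it assumes the procps "ps -eo rss=" form, which prints one RSS value in KiB per process without a header, and it compares the total with MemTotal from /proc/meminfo. RSS counts shared pages once per process, so the sum is an upper bound on process memory rather than an exact figure.

```python
import subprocess


def sum_rss_kib(ps_output):
    """Sum the RSS column (KiB) from the output of `ps -eo rss=`."""
    return sum(int(line) for line in ps_output.split())


def meminfo_value_kib(meminfo_text, field):
    """Return the value in kB of one field from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name == field:
            return int(rest.split()[0])
    raise KeyError(field)


if __name__ == "__main__":
    ps_output = subprocess.run(
        ["ps", "-eo", "rss="], capture_output=True, text=True, check=True
    ).stdout
    with open("/proc/meminfo") as f:
        total = meminfo_value_kib(f.read(), "MemTotal")
    rss = sum_rss_kib(ps_output)
    print("RSS sum: %d kB of MemTotal %d kB" % (rss, total))
```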


free:
             total       used       free     shared    buffers     cached
Mem:       4060648    3695996     364652       2588        968      23380
-/+ buffers/cache:    3671648     389000
Swap:      1355772    1277624      78148


Best regards

