
Bug#914505: linux: reference to netfilter chain not removed on rule replacement, subsequently system hangs



Control: tags -1 + moreinfo

On Sat, Nov 24, 2018 at 03:48:06AM +0100, Christoph Anton Mitterer wrote:
> Source: linux
> Version: 4.18.20-1
> Severity: important
> Tags: upstream
> 
> 
> Hi.
> 
> The following may also be partly the fault of iptables (i.e. the userland tool).
> 
> I'm using fail2ban in a custom mode: the hook rule for fail2ban's chain
> isn't just appended somewhere, but inserted at just the right point in my
> iptables rules (loaded at boot by netfilter-persistent).
> 
> This looks e.g. like the following in terms of rules:
> ...
> -A INPUT	--in-interface lo  -m comment  --comment "f2b-hook-sshd"
> -A INPUT	--destination 0.ssh.srv.localhost  --protocol tcp  -m tcp  --destination-port ssh --syn	-j ACCEPT
> ...
> (where the first rule serves as a dummy rule)
> 
> And an /etc/fail2ban/action.d/iptables-multiport.conf which looks like:
> ...
> actionstart = <iptables> -N f2b-<name>
>               <iptables> -A f2b-<name> -j <returntype>
>               rulenum="$( <iptables> -L <chain> --line-numbers  |  grep '/\* f2b-hook-<name> \*/'  |  cut -d ' ' -f 1 )"
>               <iptables> -R <chain> "${rulenum}" -p <protocol> -m multiport --dports <port> -j f2b-<name>
> ...
> actionstop = rulenum="$( <iptables> -L <chain> --line-numbers  |  grep f2b-<name>  |  cut -d ' ' -f 1 )"
>              <iptables> -R <chain> "${rulenum}" --in-interface lo -m comment --comment f2b-hook-<name>
>              <iptables> -F f2b-<name>
>              <iptables> -X f2b-<name>
> ...
> 
> So far so good.
> 
> 
> When fail2ban starts I get something like this:
> # iptables -L
> Chain INPUT (policy DROP)
> target     prot opt source               destination         
> ACCEPT     all  --  anywhere             anywhere            
> ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
> ACCEPT     icmp --  anywhere             anywhere            
> DROP       all  --  anywhere             anywhere             state INVALID,UNTRACKED
> f2b-sshd   tcp  --  anywhere             anywhere             multiport dports ssh
> REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
> ...
> Chain f2b-sshd (1 references)
> target     prot opt source               destination         
> RETURN     all  --  anywhere             anywhere            
> 
> 
> Everything fine.
> 
> 
> But now:
> 
> 
> 1) Replacing the rule that causes the reference to f2b-sshd doesn't clear the reference.
> Now when I stop fail2ban it will do something like:
> iptables -R INPUT 5 --in-interface lo -m comment --comment f2b-hook-<name>
> i.e. bringing back the original dummy rule, but here some error happens in either
> iptables or the kernel or both:
> # iptables -L
> Chain INPUT (policy DROP)
> target     prot opt source               destination         
> ACCEPT     all  --  anywhere             anywhere            
> ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
> ACCEPT     icmp --  anywhere             anywhere            
> DROP       all  --  anywhere             anywhere             state INVALID,UNTRACKED
>            all  --  anywhere             anywhere             /* f2b-hook-ssh */
> REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable
> ...
> Chain f2b-sshd (1 references)
> target     prot opt source               destination         
> 
> The dummy rule in INPUT is brought back and the chain f2b-sshd is flushed, but
> the chain is left behind with its reference count still at 1, which is obviously
> wrong, as the rule no longer references the chain.
> 
> This also happens when just calling the iptables commands manually.
> It does not happen when e.g. deleting the rules (iptables -D), as fail2ban would
> do by default.
> 
> 
> If I repeat this multiple times, I can even make the reference count go up, e.g.:
> Chain f2b-sshd (2 references)
> target     prot opt source               destination         
> 
> 
> 
> 2) The kernel is now in a state from which it cannot recover, it seems.
> 
> It doesn't seem to be possible to get rid of the broken chain,
> even after deleting the rule that was replaced (or rather,
> its replacement).
> 
> When I try to start from scratch with e.g.
> # iptables-restore < /etc/iptables/rules.v4
> The process hangs and I get a:
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115308] ------------[ cut here ]------------
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115320] kernel BUG at /build/linux-iActNR/linux-4.18.10/net/netfilter/nf_tables_api.c:1364!
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115367] invalid opcode: 0000 [#1] SMP PTI
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115379] CPU: 3 PID: 17642 Comm: iptables-restor Not tainted 4.18.0-2-amd64 #1 Debian 4.18.10-2
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115382] Hardware name: FUJITSU LIFEBOOK U757/FJNB2A5, BIOS Version 1.21 03/19/2018
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115412] RIP: 0010:nf_tables_chain_destroy.isra.48+0x95/0xa0 [nf_tables]
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115414] Code: 51 bf ab d8 48 8b 7b 58 e8 78 5b b3 d8 48 89 ef 5b 5d e9 6e 5b b3 d8 48 8b 7b 58 e8 65 5b b3 d8 48 89 df 5b 5d e9 5b 5b b3 d8 <0f> 0b 0f 0b eb 9c 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 07 8b 90 
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115450] RSP: 0018:ffffa6c70aaf3998 EFLAGS: 00010202
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115454] RAX: 0000000000000001 RBX: ffffffff9a2dafc0 RCX: dead000000000200
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115456] RDX: ffff99a0758d3cc0 RSI: ffff99a18a8fa980 RDI: ffff99a22927ef00
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115457] RBP: ffff99a0758d3cc0 R08: 0000000000000000 R09: ffffffffc08ff600
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115459] R10: ffff99a18a8fae00 R11: 0000000000000001 R12: dead000000000200
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115461] R13: dead000000000100 R14: ffff99a18a8fa980 R15: ffffffff9a2dc220
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115464] FS:  00007fb5a8b28b80(0000) GS:ffff99a25dd80000(0000) knlGS:0000000000000000
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115467] CR2: 00005600467ffda4 CR3: 000000058cfa6005 CR4: 00000000003606e0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115472] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115473] Call Trace:
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115495]  nf_tables_commit+0xd13/0x1110 [nf_tables]
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115516]  nfnetlink_rcv_batch+0x562/0x6d0 [nfnetlink]
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115538]  ? kmem_cache_alloc_node_trace+0x1b0/0x1e0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115549]  ? alloc_vmap_area+0x7c/0x360
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115553]  ? __insert_vmap_area+0x99/0x100
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115562]  ? refcount_inc+0x5/0x30
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115571]  ? apparmor_capable+0x72/0xb0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115580]  ? security_capable+0x35/0x50
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115587]  ? nla_parse+0x32/0x100
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115592]  nfnetlink_rcv+0x11e/0x13c [nfnetlink]
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115604]  netlink_unicast+0x1c2/0x250
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115609]  netlink_sendmsg+0x2c1/0x3b0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115620]  sock_sendmsg+0x36/0x40
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115626]  ___sys_sendmsg+0x2a0/0x2f0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115639]  ? filemap_map_pages+0x385/0x3a0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115642]  ? refcount_inc+0x5/0x30
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115650]  ? apparmor_capable+0x72/0xb0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115655]  ? security_capable+0x35/0x50
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115660]  ? __sys_sendmsg+0x5e/0xa0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115665]  __sys_sendmsg+0x5e/0xa0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115677]  do_syscall_64+0x55/0x110
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115688]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115696] RIP: 0033:0x7fb5a8e36354
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115697] Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 91 36 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 41 89 d4 53 48 89 f5 
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115733] RSP: 002b:00007fff63f0ea38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115736] RAX: ffffffffffffffda RBX: 00007fff63f0ea50 RCX: 00007fb5a8e36354
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115739] RDX: 0000000000000000 RSI: 00007fff63f0fad0 RDI: 0000000000000003
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115740] RBP: 00007fff63f10150 R08: 0000000000000004 R09: 00007fb5a8af6f40
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115742] R10: 00007fff63f0fabc R11: 0000000000000246 R12: 00005628c5276740
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115744] R13: 00007fff63f12a20 R14: 00007fff63f0ea40 R15: 00007fff63f12a58
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115747] Modules linked in: udp_diag tcp_diag inet_diag nft_chain_route_ipv4 xt_CHECKSUM nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 nf_nat tun bridge stp llc ctr ccm fuse devlink ebtable_filter ebtables cpufreq_userspace cpufreq_powersave cpufreq_conservative arc4 snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support intel_rapl snd_hda_codec_realtek nf_conntrack_ipv6 nf_defrag_ipv6 x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_generic xt_tcpudp kvm_intel iwlmvm snd_soc_skl snd_soc_skl_ipc ip6t_REJECT snd_soc_sst_ipc nf_reject_ipv6 snd_soc_sst_dsp kvm snd_hda_ext_core irqbypass snd_soc_acpi mac80211 crct10dif_pclmul snd_soc_core crc32_pclmul snd_compress btusb btrtl snd_hda_intel btbcm btintel snd_hda_codec ghash_clmulni_intel bluetooth snd_hda_core intel_cstate iwlwifi snd_hwdep uvcvideo
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115815]  intel_uncore snd_pcm videobuf2_vmalloc videobuf2_memops cdc_mbim intel_rapl_perf videobuf2_v4l2 snd_timer cdc_wdm videobuf2_common nf_conntrack_ipv4 cdc_ncm nf_defrag_ipv4 usbnet videodev mii snd pcspkr sdhci_pci cqhci joydev media drbg i915 soundcore sdhci ansi_cprng idma64 nft_counter mmc_core cfg80211 sg ecdh_generic drm_kms_helper crc16 rfkill mei_me intel_lpss_pci drm i2c_i801 intel_lpss xt_comment mei i2c_algo_bit ipt_REJECT nf_reject_ipv4 wmi button battery xt_multiport xt_policy xt_state xt_conntrack nf_conntrack nft_compat tpm_crb fujitsu_laptop tpm_tis tpm_tis_core sparse_keymap video tpm pcc_cpufreq acpi_pad ac rng_core nf_tables nfnetlink binfmt_misc loop parport_pc sunrpc ppdev lp parport ip_tables x_tables autofs4 dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115895]  async_pq async_xor async_tx raid1 raid0 multipath linear md_mod btrfs libcrc32c crc32c_generic xor zstd_decompress zstd_compress xxhash raid6_pq uhci_hcd ehci_pci ehci_hcd usb_storage sd_mod crc32c_intel ahci libahci xhci_pci xhci_hcd aesni_intel aes_x86_64 crypto_simd libata cryptd glue_helper evdev scsi_mod psmouse serio_raw e1000e usbcore usb_common
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115967] ---[ end trace 78344f348b2da5ca ]---
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115982] RIP: 0010:nf_tables_chain_destroy.isra.48+0x95/0xa0 [nf_tables]
> Nov 24 03:00:01 heisenberg kernel: [ 9857.115984] Code: 51 bf ab d8 48 8b 7b 58 e8 78 5b b3 d8 48 89 ef 5b 5d e9 6e 5b b3 d8 48 8b 7b 58 e8 65 5b b3 d8 48 89 df 5b 5d e9 5b 5b b3 d8 <0f> 0b 0f 0b eb 9c 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 07 8b 90 
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116015] RSP: 0018:ffffa6c70aaf3998 EFLAGS: 00010202
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116017] RAX: 0000000000000001 RBX: ffffffff9a2dafc0 RCX: dead000000000200
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116018] RDX: ffff99a0758d3cc0 RSI: ffff99a18a8fa980 RDI: ffff99a22927ef00
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116020] RBP: ffff99a0758d3cc0 R08: 0000000000000000 R09: ffffffffc08ff600
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116022] R10: ffff99a18a8fae00 R11: 0000000000000001 R12: dead000000000200
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116023] R13: dead000000000100 R14: ffff99a18a8fa980 R15: ffffffff9a2dc220
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116025] FS:  00007fb5a8b28b80(0000) GS:ffff99a25dd80000(0000) knlGS:0000000000000000
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116027] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116028] CR2: 00005600467ffda4 CR3: 000000058cfa6005 CR4: 00000000003606e0
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116030] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Nov 24 03:00:01 heisenberg kernel: [ 9857.116032] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> 
> Further, all networking seems dead now (probably because netfilter has said goodbye).
> Cleanly rebooting also fails, as systemd tries to shut down all kinds of (now hanging)
> networking stuff (including netfilter-persistent) and waits forever during shutdown.
> 
> 
> I'd guess this is clearly not just an error in the userland tools, or at least the
> kernel shouldn't allow userland to get it into such a bad state.

Is this issue still reproducible for you with a recent kernel from
unstable or buster-backports? There seem to have been a couple of
commits in this area after 4.18, which might have resolved the issue.

If it is no longer reproducible, let's close the issue.
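
When retesting, it may help to read the reference count straight off the
chain header instead of eyeballing the listing. The following is only a
minimal sketch: it assumes the "Chain NAME (N references)" header format
shown in the quoted iptables -L output, and the helper name chain_refs is
made up for illustration, not part of iptables or fail2ban.

```shell
#!/bin/sh
# chain_refs CHAIN: read `iptables -L` output on stdin and print the
# reference count from the "Chain CHAIN (N references)" header line.
# Built-in chains print "(policy ...)" instead and produce no output.
chain_refs() {
    awk -v chain="$1" '
        $1 == "Chain" && $2 == chain && $4 == "references)" {
            sub(/^\(/, "", $3)   # strip the leading "(" from "(N"
            print $3
        }'
}

# Example use on a test machine (needs root; the chain name is an assumption):
#   iptables -L | chain_refs f2b-sshd
```

If the count is still non-zero after the iptables -R that restores the dummy
rule, the stale reference described in 1) is presumably still there.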

Regards,
Salvatore

