
Bug#639453: marked as done (linux-image-2.6.39-bpo.2-amd64: Dropped Connections and "Failed to create cgroup nnnn: -17" Kernel Message When vsftpd Spawning a New Process)



Your message dated Mon, 29 Aug 2011 00:05:37 +0100
with message-id <1314572769.3092.3.camel@deadeye>
and subject line Re: Bug#639453: linux-image-2.6.39-bpo.2-amd64: Dropped Connections and "Failed to create cgroup nnnn: -17" Kernel Message When vsftpd Spawning a New Process
has caused the Debian Bug report #639453,
regarding linux-image-2.6.39-bpo.2-amd64: Dropped Connections and "Failed to create cgroup nnnn: -17" Kernel Message When vsftpd Spawning a New Process
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
639453: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=639453
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: linux-2.6
Version: 2.6.39-3~bpo60+1
Severity: normal


I am experiencing the following issue with a Debian squeeze based server and the most recent squeeze-backports kernel:

I noticed that some vsftpd daemons randomly drop connections (sending a FIN right after the initial TCP handshake completes). In addition, the kernel logs a "Failed to create cgroup nnnn: -17" message.

I am also observing a steadily increasing number of directories, named like PIDs, being created in the root of the cgroup virtual filesystem (mounted at /cgroup). A new directory is created for each connection attempt to a vsftpd daemon, and those directories never seem to be deleted. After a few days of uptime there are about 7,500 directories, while only about 150 processes are running at any time (more or less idling; this server usually has low load).
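
The mismatch described above (thousands of PID-named directories versus roughly 150 running processes) can be checked with a short script. The following Python sketch is illustrative only and is not part of the original report: the function names are hypothetical, and it assumes the cgroup filesystem is mounted at /cgroup as described.

```python
import os


def pid_named_dirs(cgroup_root="/cgroup"):
    """Return the direct subdirectories of cgroup_root whose names are all digits."""
    names = [
        entry.name
        for entry in os.scandir(cgroup_root)
        if entry.is_dir() and entry.name.isdigit()
    ]
    return sorted(names, key=int)


def running_pids(proc_root="/proc"):
    """Return the set of PIDs (as strings) currently listed under proc_root."""
    return {name for name in os.listdir(proc_root) if name.isdigit()}


def stale_dirs(cgroup_root="/cgroup", proc_root="/proc"):
    """Return PID-named cgroup directories that have no matching live process."""
    alive = running_pids(proc_root)
    return [name for name in pid_named_dirs(cgroup_root) if name not in alive]


if __name__ == "__main__":
    # Only report when both mount points exist on this machine.
    if os.path.isdir("/cgroup") and os.path.isdir("/proc"):
        print("PID-named cgroup directories:", len(pid_named_dirs()))
        print("running processes:", len(running_pids()))
        print("directories without a live process:", len(stale_dirs()))
```

Running it periodically (for example from cron) would show whether the number of directories without a live process keeps growing.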

When stracing vsftpd the call that fails seems to be this one (full output below):

clone(child_stack=0, flags=0x28000000|SIGCHLD) = -1 EEXIST (File exists)

This makes me believe that those "zombie directories" in /cgroup might conflict with the new PID. The longer the server is up, the more likely it becomes that connections are dropped.
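
For reference, the numeric values in the strace line and in the kernel message can be decoded as follows. This Python sketch is an illustration and not part of the original report: the helper name is hypothetical, and the constants are the namespace-related clone(2) flag values from the Linux headers. Under those values, the mask 0x28000000 corresponds to CLONE_NEWPID|CLONE_NEWIPC, and 17 is the errno value of EEXIST, which matches the "-17" in the kernel message.

```python
import errno

# Namespace-related clone(2) flag values, as defined in <linux/sched.h>.
CLONE_FLAGS = {
    0x00020000: "CLONE_NEWNS",
    0x04000000: "CLONE_NEWUTS",
    0x08000000: "CLONE_NEWIPC",
    0x10000000: "CLONE_NEWUSER",
    0x20000000: "CLONE_NEWPID",
    0x40000000: "CLONE_NEWNET",
}


def decode_clone_flags(flags):
    """Return the names of the known namespace flags set in 'flags'.

    Any bits not covered by CLONE_FLAGS are appended as one hex string.
    """
    names = [name for bit, name in sorted(CLONE_FLAGS.items()) if flags & bit]
    known_mask = sum(CLONE_FLAGS)  # keys are distinct bits, so the sum is the OR
    unknown = flags & ~known_mask
    if unknown:
        names.append(hex(unknown))
    return names


if __name__ == "__main__":
    # Flag mask taken from the failing clone() call in the strace output.
    print("|".join(decode_clone_flags(0x28000000)))
    # The kernel message reports -17; look up the symbolic errno name.
    print(errno.errorcode[17])
```

So the failing call asks for a new PID namespace and a new IPC namespace for each spawned child, and the kernel's -17 is the same EEXIST that clone() returns to vsftpd.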

Side note: The affected vsftpd daemons are running on a server that also hosts an LXC-based virtual server. With the previous kernel I experienced a steadily increasing soft IRQ load on the server while a cgroup virtual filesystem was mounted. I then upgraded to the most recent squeeze-backports kernel, which does not seem to suffer from this soft IRQ issue. vsftpd daemons running inside LXC containers do not drop connections.

Below is some information I thought might be useful. If required, I will gladly provide any additional information.



I am using vsftpd 2.3.2-3 which AFAIK is the most recent version available from the squeeze and squeeze-backports repositories.

"strace vsftpd /etc/vsftp.conf" output (successful connection attempt):
alarm(1)                                = 0
rt_sigreturn(0x1)                       = -1 EINTR (Interrupted system call)
alarm(0)                                = 1
wait4(-1, NULL, WNOHANG, NULL)          = 6385
wait4(-1, NULL, WNOHANG, NULL)          = -1 ECHILD (No child processes)
accept(3, {sa_family=AF_INET, sin_port=htons(46631), sin_addr=inet_addr("xxx.xxx.xxx.xxx")}, [16]) = 4
clone(child_stack=0, flags=0x28000000|SIGCHLD) = 6387
close(4)                                = 0
accept(3, 0x7fffc3ecdf70, [28])         = ? ERESTARTSYS (To be restarted)

"strace vsftpd /etc/vsftp.conf" output (failed connection attempt):
alarm(1)                                = 0
rt_sigreturn(0x1)                       = -1 EINTR (Interrupted system call)
alarm(0)                                = 1
wait4(-1, NULL, WNOHANG, NULL)          = 6387
wait4(-1, NULL, WNOHANG, NULL)          = -1 ECHILD (No child processes)
accept(3, {sa_family=AF_INET, sin_port=htons(47917), sin_addr=inet_addr("xxx.xxx.xxx.xxx")}, [16]) = 4
clone(child_stack=0, flags=0x28000000|SIGCHLD) = -1 EEXIST (File exists)
close(4)                                = 0
accept(3, 0x7fffc3ecdf70, [28])         = ? ERESTARTSYS (To be restarted)



-- Package-specific info:
** Version:
Linux version 2.6.39-bpo.2-amd64 (Debian 2.6.39-3~bpo60+1) (norbert@tretkowski.de) (gcc version 4.4.5 (Debian 4.4.5-8) ) #1 SMP Tue Jul 26 10:35:23 UTC 2011

** Command line:
BOOT_IMAGE=/vmlinuz-2.6.39-bpo.2-amd64 root=/dev/mapper/VG1-LV_ROOT ro quiet

** Not tainted

** Model information
sys_vendor: HP
product_name: ProLiant DL380 G6
product_version: 
chassis_vendor: HP
chassis_version: 
bios_vendor: HP
bios_version: P62

** Loaded modules:
Module                  Size  Used by
ipt_LOG                12605  6 
xt_time                12459  0 
xt_connlimit           12554  0 
xt_helper              12507  0 
xt_realm               12423  0 
xt_NFQUEUE             12544  0 
xt_tcpmss              12425  0 
xt_addrtype            12557  3 
xt_pkttype             12427  0 
xt_TPROXY              12806  0 
nf_tproxy_core         12404  1 xt_TPROXY,[permanent]
ip6_tables             21907  1 xt_TPROXY
nf_defrag_ipv6         12757  1 xt_TPROXY
xt_CLASSIFY            12429  0 
xt_mark                12453  1 
xt_hashlimit           13124  0 
xt_comment             12427  24 
ipt_REJECT             12465  4 
xt_length              12460  0 
xt_connmark            12637  0 
xt_owner               12423  0 
xt_recent              13006  0 
xt_iprange             12504  0 
xt_policy              12506  0 
xt_conntrack           12639  11 
iptable_raw            12524  0 
veth                   13170  0 
xt_multiport           12518  6 
xt_state               12503  8 
xt_tcpudp              12527  46 
xt_physdev             12468  2 
iptable_mangle         12536  1 
iptable_nat            12928  0 
nf_nat                 18045  1 iptable_nat
nf_conntrack_ipv4      18081  22 iptable_nat,nf_nat
nf_conntrack           56001  8 xt_connlimit,xt_helper,xt_connmark,xt_conntrack,xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4         12483  2 xt_TPROXY,nf_conntrack_ipv4
iptable_filter         12536  2 
ip_tables              21818  4 iptable_raw,iptable_mangle,iptable_nat,iptable_filter
x_tables               18886  32 ipt_LOG,xt_time,xt_connlimit,xt_helper,xt_realm,xt_NFQUEUE,xt_tcpmss,xt_addrtype,xt_pkttype,xt_TPROXY,ip6_tables,xt_CLASSIFY,xt_mark,xt_hashlimit,xt_comment,ipt_REJECT,xt_length,xt_connmark,xt_owner,xt_recent,xt_iprange,xt_policy,xt_conntrack,iptable_raw,xt_multiport,xt_state,xt_tcpudp,xt_physdev,iptable_mangle,iptable_nat,iptable_filter,ip_tables
cls_u32                12968  4 
sch_multiq             13201  3 
sch_prio               13163  1 
bridge                 65614  0 
stp                    12392  1 bridge
loop                   22479  0 
snd_pcm                67276  0 
snd_timer              22658  1 snd_pcm
radeon                743013  1 
ipmi_si                36432  0 
ttm                    52224  1 radeon
drm_kms_helper         26950  1 radeon
tpm_tis                13125  0 
tpm                    17756  1 tpm_tis
ipmi_msghandler        35746  1 ipmi_si
tpm_bios               12903  1 tpm
snd                    52324  2 snd_pcm,snd_timer
drm                   166500  3 radeon,ttm,drm_kms_helper
soundcore              13014  1 snd
i2c_algo_bit           12834  1 radeon
i2c_core               23766  4 radeon,drm_kms_helper,drm,i2c_algo_bit
i7core_edac            18121  0 
edac_core              35344  1 i7core_edac
power_supply           13475  1 radeon
evdev                  17475  2 
snd_page_alloc         12969  1 snd_pcm
power_meter            17382  0 
pcspkr                 12579  0 
hpilo                  12889  0 
hpwdt                  12852  0 
container              12581  0 
psmouse                55199  0 
button                 12895  0 
processor              27431  8 
serio_raw              12878  0 
ext3                  112254  4 
jbd                    41698  1 ext3
mbcache                12930  1 ext3
dm_mod                 62468  9 
sg                     25769  0 
sr_mod                 21824  0 
cdrom                  35134  1 sr_mod
ata_generic            12479  0 
usbhid                 39946  0 
hid                    72745  1 usbhid
uhci_hcd               26290  0 
sd_mod                 35644  4 
ata_piix               25319  0 
crc_t10dif             12348  1 sd_mod
libata                151572  2 ata_generic,ata_piix
ehci_hcd               39487  0 
usbcore               127203  4 usbhid,uhci_hcd,ehci_hcd
bnx2                   62576  0 
hpsa                   39800  3 
scsi_mod              161557  5 sg,sr_mod,sd_mod,libata,hpsa
thermal                17330  0 
thermal_sys            17939  2 processor,thermal

-- System Information:
Debian Release: 6.0.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.39-bpo.2-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.39-bpo.2-amd64 depends on:
ii  debconf [debconf-2.0]       1.5.36.1     Debian configuration management sy
ii  initramfs-tools [linux-init 0.99~bpo60+1 tools for generating an initramfs
ii  linux-base                  3.3~bpo60+1  Linux image base package
ii  module-init-tools           3.12-1       tools for managing Linux kernel mo

Versions of packages linux-image-2.6.39-bpo.2-amd64 recommends:
ii  firmware-linux-free           2.6.32-35  Binary firmware for various driver

Versions of packages linux-image-2.6.39-bpo.2-amd64 suggests:
ii  grub-pc                 1.98+20100804-14 GRand Unified Bootloader, version 
pn  linux-doc-2.6.39        <none>           (no description available)

Versions of packages linux-image-2.6.39-bpo.2-amd64 is related to:
ii  firmware-bnx2               0.32~bpo60+1 Binary firmware for Broadcom NetXt
pn  firmware-bnx2x              <none>       (no description available)
pn  firmware-ipw2x00            <none>       (no description available)
pn  firmware-ivtv               <none>       (no description available)
pn  firmware-iwlwifi            <none>       (no description available)
pn  firmware-linux              <none>       (no description available)
ii  firmware-linux-nonfree      0.32~bpo60+1 Binary firmware for various driver
pn  firmware-qlogic             <none>       (no description available)
pn  firmware-ralink             <none>       (no description available)
pn  xen-hypervisor              <none>       (no description available)

-- debconf information:
* linux-image-2.6.39-bpo.2-amd64/postinst/missing-firmware-2.6.39-bpo.2-amd64:
  linux-image-2.6.39-bpo.2-amd64/postinst/ignoring-ramdisk:
  linux-image-2.6.39-bpo.2-amd64/postinst/depmod-error-initrd-2.6.39-bpo.2-amd64: false
  linux-image-2.6.39-bpo.2-amd64/prerm/removing-running-kernel-2.6.39-bpo.2-amd64: true



--- End Message ---
--- Begin Message ---
On Sun, 2011-08-28 at 22:40 +0200, Dirk Weinhardt wrote:
> Hi Ben,
> 
> > Please can you test whether this is fixed in Linux 3.0 (available in
> > testing and unstable).
> 
> I installed the 3.0.0-1 testing kernel on the squeeze box and reran the 
> test. The number of directories in /cgroup does not increase even after 
> opening and closing 50 connections to vsftpd (using nagios' check_ftp 
> command).
> 
> Install command was: apt-get -t testing --no-install-recommends install 
> linux-image-3.0.0-1-amd64 firmware-linux
> 
> xxx:~# uname -a
> Linux xxx 3.0.0-1-amd64 #1 SMP Sun Jul 24 02:24:44 UTC 2011 x86_64 GNU/Linux
> 
> Would the test need to be done on a system that was entirely upgraded to 
> testing or even cleanly installed from testing?

No, that's not necessary.

> I noticed that after upgrading to Linux 3.0 starting any LXC container 
> fails with this output:
> 
> xxx:~# lxc-start -n vm0
> lxc-start: No such file or directory - failed to rename cgroup 
> /cgroup/7587->/cgroup/vm0
> lxc-start: failed to spawn 'vm0'
> lxc-start: No such file or directory - failed to remove cgroup '/cgroup/vm0'
> 
> Is any additional information or testing required?

That appears to be a bug in LXC:

https://bbs.archlinux.org/viewtopic.php?pid=975354#p975354

But I'm not sure.

Ben.

--- End Message ---
