Hello,

I would like to submit a problem involving the fork system call.

I have a Samba 2.2.x server (official package from Debian Woody) which hosts Windows user home directories. Around 100 clients can be using the server at the same time. Sometimes, usually when many people are using the Samba server, it refuses new connections. If I try to log in on the console at the same time, I get the following error message:

    bash: fork: Ressource temporairement indisponible

(in English: bash: fork: Resource temporarily unavailable)

and new clients cannot access the home Samba share. This happens almost every day.

If I log in as root while the network activity is low and wait until the server starts refusing new client connections again, I get the same kind of errors:

    # cat /proc/sys/fs/file-nr
    2050 3 32000
    # ps aux
    bash: fork: Ressource temporairement indisponible
    # ls
    bash: fork: Ressource temporairement indisponible
    # which
    bash: fork: Ressource temporairement indisponible
    # jobs
    # kill
    kill: usage: kill [-s sigspec | -n signum | -sigspec] [pid | job]... or kill -l [sigspec]
    # ls
    bash: fork: Ressource temporairement indisponible

I had a look at the implementation of the fork syscall on Linux. In kernel/fork.c, the function do_fork (line 633 in Linux 2.4.21) contains this comment:

     * Check if we are over our maximum process limit, but be sure to
     * exclude root.

So I assumed the problem was related to neither pam_limits nor ulimit process limits. It seems that Samba starts a new root process and a new user process for each new connection, and soon enough I have around 100 processes on the server. I already set the kernel file-max (integer) to 32760 with sysctl. In the kernel source, I saw that the fork system call can also fail when nr_threads reaches the system-wide thread limit. But on the server, the /proc value for threads-max lies between 8000 and 9000.
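The limits mentioned above (the task count against threads-max, and file-nr against file-max) can be read with bash builtins alone, which matters here because every external command fails once fork does. The following is only a minimal sketch, assuming bash and a Linux /proc whose loadavg file carries a "running/total" task pair in its fourth field. The function name report_limits and its optional proc-root argument are hypothetical, introduced for illustration.

```shell
#!/bin/bash
# Sketch: report task and file-handle usage using bash builtins only.
# No external commands, pipes, or $(...) are used inside the function,
# because each of those would require a fork.

# report_limits [proc_root]
# proc_root is a hypothetical override (default: /proc) so the function
# can be pointed at a directory of sample files.
report_limits() {
    local proc=${1:-/proc}
    local load1 load5 load15 tasks rest
    local running total threads_max
    local allocated second file_max

    # Fourth field of loadavg is assumed to be "running/total" tasks.
    read -r load1 load5 load15 tasks rest < "$proc/loadavg" || return 1
    running=${tasks%/*}
    total=${tasks#*/}

    # System-wide limit on the number of tasks.
    read -r threads_max < "$proc/sys/kernel/threads-max" || return 1

    # file-nr holds three numbers; the first is the allocated count
    # and the last is the maximum (file-max).
    read -r allocated second file_max < "$proc/sys/fs/file-nr" || return 1

    echo "tasks: $total of $threads_max (running: $running)"
    echo "file handles: $allocated allocated, max $file_max"
}
```

Since the function relies only on read, echo and parameter expansion, it could be defined in a root shell that is already open and then called as report_limits once fork starts failing.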
I had a cron job running on the server that logged the output of w, free, vmstat and ps aux to a file every 5 minutes. I have attached the log covering the interval 14:05 - 15:55. Between 14:15 and 14:30, at 14:35, and at 14:45, the script wrote nothing to the log file, probably because it could not be started (fork: resource unavailable). As you can see, there is still plenty of free memory and there are no defunct / zombie Samba processes. Also attached are the Samba logs covering the interval in which the server was unavailable.

Do you have any idea what could make fork fail? Which direction would you investigate?

Thank you very much in advance,

--
Jean HAUSSER
Attachment:
mail_samba.ll.log
Description: Cron job (running ps, w, free) output
Attachment:
mail_samba.smbd.log
Description: Samba's smbd log file