
Re: Exim "closing connection" for no apparent reason



On Sun, Feb 16, 2003 at 08:14:15PM +0000, Pigeon wrote:
> On Sat, Feb 15, 2003 at 10:44:06PM -0600, Nathan E Norman wrote:
> > On Sun, Feb 16, 2003 at 02:22:58AM +0000, Pigeon wrote:
> > > On Sat, Feb 15, 2003 at 06:09:08PM +0000, Pigeon wrote:
> > > > If I now reboot nestie and try exim -qf again, all the mails are
> > > > forwarded to pigeon quite happily, even though there are probably many
> > > > more than 40 (the number at which I first had the problem). This
> > > > suggests to me that the problem is not really pigeon refusing
> > > > connection, but nestie ceasing to be able to make the connection.
> > > > Running nestie's /etc/init.d/networking restart makes no difference.
> > > 
> > > It just happened again... only this time rebooting nestie didn't work,
> > > rebooting pigeon did. Argh!
> > > 
> > > Forgot to mention in the original post: nestie is headless, I ssh into
> > > it. Ping, the ssh connection, nfs and web browsing all still work -
> > > it's only exim that's caught a cold.
> > 
> > Are you sure this isn't an exim config issue?  It's been a long time
> > since I used exim but IIRC it has a limit on concurrent deliveries.
> 
> Well, I've tried these settings
> 
>   smtp_connect_backlog = 50
>   smtp_accept_queue_per_connection = 1000
>   smtp_accept_max = 0
>   log_smtp_connections = true
>   
> without success.
> 
> Having seen it happen a few more times, it stops working quite
> consistently after the 38th message, except in the case where one of
> the first 38 messages was 400k of some stupid microfots virus that
> took some time to download; that time it stopped at about 67.
> 
> Further experimentation leads me to think that it may be something to
> do with whatever magic starts exim in response to an smtp connection.
> Normally, I have no exim daemon running, according to ps ax, either on
> nestie or pigeon. Like this (from nestie):

[ snip ps output ]

> But from either machine, I can telnet port 25 of the other machine,
> and get a response from Exim:
> 
> Trying 192.168.1.2...
> Connected to nestie.pigeonloft.
> Escape character is '^]'.
> 220 nestie.pigeonloft ESMTP Exim 3.35 #1 Sun, 16 Feb 2003 20:03:34 +0000
> 
> - and the same going from nestie to pigeon, ie. the normal direction
> of mail transfer.
> 
> Having made this connection, if I then do ps ax on the remote machine,
> it shows an exim -bs process as having magically started up in
> response to my connection.
> 
> When The Problem Strikes, trying to telnet port 25 of pigeon from
> nestie gives me "Connection refused", and obviously no exim -bs
> process. But I find I can start exim -bd from the command line on
> pigeon, and that gets it working again; all the mail comes across no
> problem. But the exim -bd daemon is always there of course, it doesn't
> magically start and stop in response to SMTP connections.
> 
> I guess there must be something in that apparently innocent list of
> processes which is responsible for the exim -bs magic, and it's the
> configuration of that which is running out of mana? But I don't know
> which config to check.

I think I've got it ... are you running exim from inetd?  If so, my
guess is that you're running into the respawn limit of inetd itself,
since fetchmail is opening a connection for each message.

If you are running from inetd, there are two paths you can take to
resolve this issue:

1) Run exim as a daemon instead.

2) Edit your /etc/inetd.conf to allow more than 40 connections within
one minute; I'll quote the relevant part of man 8 inetd:

     The wait/nowait entry is applicable to datagram sockets only (other
     sockets should have a ``nowait'' entry in this space). If a
     datagram server connects to its peer, freeing the socket so inetd
     can receive further messages on the socket, it is said to be a
     ``multi-threaded'' server, and should use the ``nowait'' entry. For
     datagram servers which process all incoming datagrams on a socket
     and eventually time out, the server is said to be
     ``single-threaded'' and should use a ``wait'' entry. Comsat(8)
     (biff(1)) and talkd(8) are both examples of the latter type of
     datagram server. Tftpd(8) is an exception; it is a datagram server
     that establishes pseudo-connections. It must be listed as ``wait''
     in order to avoid a race; the server reads the first packet,
     creates a new socket, and then forks and exits to allow inetd to
     check for new service requests to spawn new servers. The optional
     ``max'' suffix (separated from ``wait'' or ``nowait'' by a dot)
     specifies the maximum number of server instances that may be
     spawned from inetd within an interval of 60 seconds. When
     omitted, ``max'' defaults to 40.
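
For example, assuming exim is started by an inetd.conf line like the
one Debian's exim package normally installs (I haven't seen yours, so
the user, the path and the figure of 200 are only illustrative),
raising the limit would look something like this:

     # before (no suffix, so "max" defaults to 40):
     # smtp  stream  tcp  nowait      mail  /usr/sbin/exim  exim -bs
     # after (up to 200 spawns per 60 seconds; 200 is an arbitrary pick):
     smtp    stream  tcp  nowait.200  mail  /usr/sbin/exim  exim -bs

inetd only rereads inetd.conf when it receives a SIGHUP, so signal it
(or restart it) after editing.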

I knew I'd seen the constant 40 somewhere; my apologies for mistakenly
thinking it was an exim issue.

-- 
Nathan Norman - Incanus Networking mailto:nnorman@incanus.net
  No.
  > Should I include quotations after my reply?


