Re: Exim "closing connection" for no apparent reason
On Sun, Feb 16, 2003 at 08:14:15PM +0000, Pigeon wrote:
> On Sat, Feb 15, 2003 at 10:44:06PM -0600, Nathan E Norman wrote:
> > On Sun, Feb 16, 2003 at 02:22:58AM +0000, Pigeon wrote:
> > > On Sat, Feb 15, 2003 at 06:09:08PM +0000, Pigeon wrote:
> > > > If I now reboot nestie and try exim -qf again, all the mails are
> > > > forwarded to pigeon quite happily, even though there are probably many
> > > > more than 40 (the number at which I first had the problem). This
> > > > suggests to me that the problem is not really pigeon refusing
> > > > connection, but nestie ceasing to be able to make the connection.
> > > > Running nestie's /etc/init.d/networking restart makes no difference.
> > >
> > > It just happened again... only this time rebooting nestie didn't work,
> > > rebooting pigeon did. Argh!
> > >
> > > Forgot to mention in the original post: nestie is headless, I ssh into
> > > it. Ping, the ssh connection, nfs and web browsing all still work -
> > > it's only exim that's caught a cold.
> >
> > Are you sure this isn't an exim config issue? It's been a long time
> > since I used exim but IIRC it has a limit on concurrent deliveries.
>
> Well, I've tried these settings
>
> smtp_connect_backlog = 50
> smtp_accept_queue_per_connection = 1000
> smtp_accept_max = 0
> log_smtp_connections = true
>
> without success.
>
> Having seen it happen a few more times, it stops working quite
> consistently after the 38th message, except in the case where one of
> the first 38 messages was 400k of some stupid microfots virus that
> took some time to download; that time it stopped at about 67.
>
> Further experimentation leads me to think that it may be something to
> do with whatever magic starts exim in response to an smtp connection.
> Normally, I have no exim daemon running, according to ps ax, either on
> nestie or pigeon. Like this (from nestie):
[ snip ps output ]
> But from either machine, I can telnet port 25 of the other machine,
> and get a response from Exim:
>
> Trying 192.168.1.2...
> Connected to nestie.pigeonloft.
> Escape character is '^]'.
> 220 nestie.pigeonloft ESMTP Exim 3.35 #1 Sun, 16 Feb 2003 20:03:34 +0000
>
> - and the same going from nestie to pigeon, ie. the normal direction
> of mail transfer.
>
> Having made this connection, if I then do ps ax on the remote machine,
> it shows an exim -bs process as having magically started up in
> response to my connection.
>
> When The Problem Strikes, trying to telnet port 25 of pigeon from
> nestie gives me "Connection refused", and obviously no exim -bs
> process. But I find I can start exim -bd from the command line on
> pigeon, and that gets it working again; all the mail comes across no
> problem. But the exim -bd daemon is always there of course, it doesn't
> magically start and stop in response to SMTP connections.
>
> I guess there must be something in that apparently innocent list of
> processes which is responsible for the exim -bs magic, and it's the
> configuration of that which is running out of mana? But I don't know
> which config to check.
I think I've got it ... are you running exim from inetd? If so, my
guess is that you're running into the respawn limit of inetd itself,
since fetchmail is opening a connection for each message.

If you are running from inetd, there are two paths you can take to
resolve this issue:

1) Run exim as a daemon instead.

2) Edit your /etc/inetd.conf to allow more than 40 connections within
   one minute; I'll quote the relevant part of man 8 inetd:

The wait/nowait entry is applicable to datagram sockets only (other
sockets should have a ``nowait'' entry in this space). If a
datagram server connects to its peer, freeing the socket so inetd
can receive further messages on the socket, it is said to be a
``multi-threaded'' server, and should use the ``nowait'' entry. For
datagram servers which process all incoming datagrams on a socket
and eventually time out, the server is said to be
``single-threaded'' and should use a ``wait'' entry. Comsat(8)
(biff(1)) and talkd(8) are both examples of the latter type of
datagram server. Tftpd(8) is an exception; it is a datagram server
that establishes pseudo-connections. It must be listed as ``wait''
in order to avoid a race; the server reads the first packet,
creates a new socket, and then forks and exits to allow inetd to
check for new service requests to spawn new servers. The optional
``max'' suffix (separated from ``wait'' or ``nowait'' by a dot)
specifies the maximum number of server instances that may be
spawned from inetd within an interval of 60 seconds. When
omitted, ``max'' defaults to 40.
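
For option 2, the change would look something like the line below. This is only a sketch: it assumes your smtp entry is the usual one a Debian exim package installs (user "mail", /usr/sbin/exim, started with -bs), and the 200 is an arbitrary example value, not a recommendation. Check it against what is actually in your /etc/inetd.conf.

```
# /etc/inetd.conf (assumed existing entry, with ".200" added after nowait)
smtp    stream  tcp     nowait.200      mail    /usr/sbin/exim  exim -bs
```

inetd has to reread its configuration before the new limit applies; sending it a SIGHUP normally does that.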
I knew I'd seen the constant 40 somewhere; my apologies for mistakenly
thinking it was an exim issue.
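
To illustrate why the failures start around 40 messages, and why one slow message let the run get further, here is a rough Python model of the limit described in the man page. It is a simplified sketch, not inetd's actual code: the class and function names are made up, and it assumes a fixed 60-second window that starts at the first connection and a service that stays switched off once the limit trips.

```python
class SpawnLimiter:
    """Simplified model of inetd's per-service "max" limit (hypothetical names)."""

    def __init__(self, max_spawns=40, interval=60.0):
        self.max_spawns = max_spawns   # the "max" suffix; defaults to 40
        self.interval = interval       # the 60-second interval from the man page
        self.count = 0
        self.window_start = 0.0
        self.disabled = False          # True once the service has been switched off

    def allow(self, now):
        """Return True if a connection arriving at time `now` gets a server."""
        if self.disabled:
            return False               # client sees "Connection refused"
        if self.count == 0:
            self.window_start = now
        elif self.count >= self.max_spawns:
            if now - self.window_start <= self.interval:
                self.disabled = True   # too many spawns inside one interval
                return False
            # The interval had already expired, so start a new window.
            self.window_start = now
            self.count = 0
        self.count += 1
        return True


def accepted(arrival_times):
    """Count how many of the given connection times are served."""
    limiter = SpawnLimiter()
    return sum(1 for t in arrival_times if limiter.allow(t))


if __name__ == "__main__":
    fast = [float(i) for i in range(50)]       # one connection per second
    slow = [float(2 * i) for i in range(50)]   # one connection every two seconds
    print("fast run served:", accepted(fast))
    print("slow run served:", accepted(slow))
```

In this model a fast run of 50 connections is cut off after 40, while the same 50 spread over more than a minute all get through, which matches a slow download in the middle letting the queue run past 40.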
--
Nathan Norman - Incanus Networking mailto:nnorman@incanus.net
No.
> Should I include quotations after my reply?