
Re: Exim "closing connection" for no apparent reason



On Sat, Feb 15, 2003 at 10:44:06PM -0600, Nathan E Norman wrote:
> On Sun, Feb 16, 2003 at 02:22:58AM +0000, Pigeon wrote:
> > On Sat, Feb 15, 2003 at 06:09:08PM +0000, Pigeon wrote:
> > > If I now reboot nestie and try exim -qf again, all the mails are
> > > forwarded to pigeon quite happily, even though there are probably many
> > > more than 40 (the number at which I first had the problem). This
> > > suggests to me that the problem is not really pigeon refusing
> > > connection, but nestie ceasing to be able to make the connection.
> > > Running nestie's /etc/init.d/networking restart makes no difference.
> > 
> > It just happened again... only this time rebooting nestie didn't work,
> > rebooting pigeon did. Argh!
> > 
> > Forgot to mention in the original post: nestie is headless, I ssh into
> > it. Ping, the ssh connection, nfs and web browsing all still work -
> > it's only exim that's caught a cold.
> 
> Are you sure this isn't an exim config issue?  It's been a long time
> since I used exim but IIRC it has a limit on concurrent deliveries.

Well, I've tried these settings

  smtp_connect_backlog = 50
  smtp_accept_queue_per_connection = 1000
  smtp_accept_max = 0
  log_smtp_connections = true
  
without success.

Having seen it happen a few more times, it stops working quite
consistently after the 38th message, except in the case where one of
the first 38 messages was 400k of some stupid microfots virus that
took some time to download; that time it stopped at about 67.

Further experimentation leads me to think that it may be something to
do with whatever magic starts exim in response to an SMTP connection.
Normally, according to ps ax, I have no exim daemon running on either
nestie or pigeon. Like this (from nestie):

  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:05 init
    2 ?        SW     0:00 [keventd]
    3 ?        SWN    0:00 [ksoftirqd_CPU0]
    4 ?        SW     0:00 [kswapd]
    5 ?        SW     0:00 [bdflush]
    6 ?        SW     0:00 [kupdated]
    8 ?        SW     0:00 [scsi_eh_0]
    9 ?        SW     0:00 [kjournald]
  106 ?        S      0:00 /sbin/portmap
  171 ?        S      0:00 /sbin/syslogd
  174 ?        S      0:00 /sbin/klogd
  182 ?        S      0:00 /usr/sbin/dnsmasq -r /etc/ppp/resolv.conf
  185 ?        S      0:00 /sbin/rpc.statd
  192 ?        S      0:00 /usr/sbin/inetd
  196 ?        S      0:00 /usr/sbin/lpd
  201 ?        S      0:00 /usr/sbin/nmbd -D
  203 ?        S      0:00 /usr/sbin/smbd -D
  209 ?        S      0:00 /usr/sbin/sshd
  216 ?        S      0:00 /usr/sbin/atd
  219 ?        S      0:00 /usr/sbin/cron
  239 tty1     S      0:00 /sbin/getty 38400 tty1
  240 tty2     S      0:00 /sbin/getty 38400 tty2
  241 tty3     S      0:00 /sbin/getty 38400 tty3
  242 tty4     S      0:00 /sbin/getty 38400 tty4
  243 tty5     S      0:00 /sbin/getty 38400 tty5
  244 tty6     S      0:00 /sbin/getty 38400 tty6
  245 ?        S      0:00 /usr/sbin/sshd
  247 ?        S      0:00 /usr/sbin/sshd
  248 pts/0    S      0:00 -bash
  252 ?        S      0:00 /usr/sbin/sshd
  254 pts/1    S      0:00 -bash
 1170 pts/0    R      0:00 ps ax

But from either machine, I can telnet port 25 of the other machine,
and get a response from Exim:

Trying 192.168.1.2...
Connected to nestie.pigeonloft.
Escape character is '^]'.
220 nestie.pigeonloft ESMTP Exim 3.35 #1 Sun, 16 Feb 2003 20:03:34 +0000

- and the same going from nestie to pigeon, i.e. the normal direction
of mail transfer.
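
The telnet check above can also be scripted, which makes it easier to
repeat from either machine. This is only a minimal sketch, assuming
Python is available; smtp_banner is a hypothetical helper name and has
nothing to do with Exim itself. It returns the 220 greeting line, or
None when the connection is refused.

```python
import socket


def smtp_banner(host, port=25, timeout=5.0):
    """Return the server's greeting line, or None if the connection is refused.

    Hypothetical helper: it does the same thing as "telnet host 25" and
    reading the first line by eye.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Read one line of the greeting, e.g. "220 host ESMTP Exim ..."
            with sock.makefile("rb") as stream:
                line = stream.readline()
    except ConnectionRefusedError:
        return None
    return line.decode("ascii", errors="replace").strip()


if __name__ == "__main__":
    # Host name taken from the telnet session above.
    print(smtp_banner("nestie.pigeonloft"))
```

A None result corresponds to the "Connection refused" case; any other
network error (timeout, unknown host) is left to raise so it is not
mistaken for a refusal.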

Having made this connection, if I then do ps ax on the remote machine,
it shows an exim -bs process as having magically started up in
response to my connection.

When The Problem Strikes, trying to telnet port 25 of pigeon from
nestie gives me "Connection refused", and obviously no exim -bs
process. But I find I can start exim -bd from the command line on
pigeon, and that gets it working again; all the mail comes across with
no problem. But the exim -bd daemon is always there, of course; it
doesn't magically start and stop in response to SMTP connections.
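
To pin down where the refusals begin without sending real mail, the
connections could be counted directly. The following is a rough sketch
under the same assumption that Python is available;
count_until_refused is a hypothetical name. It opens one SMTP
connection after another, reads the greeting, sends QUIT, and stops at
the first failure, reporting how many connections succeeded and how
long that took. The elapsed time is included because the one run that
got to about 67 was also the slow one, so it may matter whether the
cutoff depends on the number of connections or on how quickly they
arrive.

```python
import socket
import time


def count_until_refused(host, port=25, max_attempts=100, timeout=5.0):
    """Open connections one at a time until one fails.

    Returns (accepted, elapsed_seconds). A connection counts as accepted
    only if the server sent a non-empty greeting line.
    """
    start = time.monotonic()
    accepted = 0
    for _ in range(max_attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                with sock.makefile("rb") as stream:
                    banner = stream.readline()
                if not banner:
                    break  # connected, but closed without a greeting
                accepted += 1
                sock.sendall(b"QUIT\r\n")  # end the SMTP session politely
        except OSError:
            # Covers "Connection refused", resets and timeouts alike.
            break
    return accepted, time.monotonic() - start


if __name__ == "__main__":
    # 192.168.1.2 is nestie's address from the telnet session above;
    # substitute whichever machine is being tested.
    n, secs = count_until_refused("192.168.1.2")
    print(f"{n} connections accepted in {secs:.1f}s")
```

Running it with and without a pause between attempts (for instance by
adding time.sleep inside the loop) would show whether the limit is a
plain count or a rate.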

I guess there must be something in that apparently innocent list of
processes which is responsible for the exim -bs magic, and it's the
configuration of that which is running out of mana? But I don't know
which config to check.

Pigeon


