
PPP stops passing data after 100kb



I am having a very strange problem, which I am having a very difficult
time diagnosing.  I recently migrated my PPP server (5 dialin lines)
from an old Pentium 100 to a not quite so old dual PII.

I certainly won't say the two setups are "identical", but they are
similar---I pretty much just copied /etc/ppp/options from one machine
to the other.  I am using mgetty to answer the phone, and then a_ppp
to start ppp, etc.  This all works fine, the lines are answered, users
are authenticated, ppp is started.

The problem occurs after ppp is running.  Everything seems to work
just fine for the first 10-1000k or so, but then data stops
flowing.  ping times are about 170ms, with no packet loss, and then
they go to 170ms with 50% packet loss.  If this were a phone line
problem, I would expect the ping times to go up as the modems
retransmit data (error correction) but to have no packet loss.

A tcpdump on the interface shows, for example:
(10.0.0.110 is connected by ppp, 10.0.0.97 is a machine that is on the
same subnet as the ppp server)

11:54:19.654643 10.0.0.110 > 10.0.0.97: icmp: echo request
11:54:20.314760 10.0.0.97 > 10.0.0.110: icmp: echo reply
11:54:21.004675 10.0.0.110 > 10.0.0.97: icmp: echo request
11:54:21.004852 10.0.0.97 > 10.0.0.110: icmp: echo reply
11:54:22.004701 10.0.0.110 > 10.0.0.97: icmp: echo request
11:54:22.004923 10.0.0.97 > 10.0.0.110: icmp: echo reply
11:54:23.004728 10.0.0.110 > 10.0.0.97: icmp: echo request
11:54:23.004949 10.0.0.97 > 10.0.0.110: icmp: echo reply

looks fine, but only two of the four echo replies made it across the
ppp connection.  A ping going the other direction looks bad too:

11:56:22.990528 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:23.187445 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:23.986612 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:24.167481 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:24.986000 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:25.985988 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)

as can be seen, that is 50% packet loss.  I get the same pattern if I
ping between the ppp client and server, instead of a machine on the
ppp server's subnet.
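
To put a number on the loss without counting lines by eye, the tcpdump
text can be tallied with a short script. This is only a sketch: the
function name echo_loss and the SAMPLE constant are made up for
illustration, and it assumes the one-line-per-packet tcpdump format
shown above. It counts requests and replies rather than matching each
reply to its request.

```python
import re

# Matches lines such as:
# 11:56:22.990528 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
LINE = re.compile(r"^\S+ (\S+) > (\S+): icmp: echo (request|reply)")

# The second capture from this message, used as sample input.
SAMPLE = """\
11:56:22.990528 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:23.187445 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:23.986612 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:24.167481 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:24.986000 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:25.985988 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
"""


def echo_loss(capture: str) -> float:
    """Return the fraction of echo requests that have no echo reply."""
    requests = 0
    replies = 0
    for line in capture.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not an ICMP echo line
        if match.group(3) == "request":
            requests += 1
        else:
            replies += 1
    if requests == 0:
        return 0.0
    return 1.0 - replies / requests
```

Calling echo_loss(SAMPLE) on the capture above gives the fraction of
requests that went unanswered.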

It is as if the ppp server gets bored with sending packets out the ppp
interface after a time.  It doesn't seem to be a routing issue (at
least initially), because everything works perfectly for the first few
hundred kb.  I can't figure out what might be changing after the
connection is up that could cause this problem.

The connections don't time out from lcp-echo-failures, so it seems
that the ppp layer is intact.  It really seems like something in the
kernel decides to stop moving packets.  I don't think it is the
ethernet driver on the ppp server (tulip), as packets that originate
from (or are destined for) the ppp server show the same behavior when
crossing the ppp link.  I have disabled all firewalling rules and
iptables, just to make sure that isn't screwing me up.
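
One way to check whether the kernel really stops handing packets to
the ppp interface would be to compare the per-interface counters in
/proc/net/dev before and after the stall. The following is a rough
sketch, not something I have run on the affected machine: the function
names are made up, the interface name ppp0 is an assumption, and it
relies on the usual /proc/net/dev layout of two header lines followed
by 8 receive and 8 transmit columns per interface.

```python
def parse_net_dev(text: str) -> dict:
    """Parse /proc/net/dev text into {interface: {counter: value}}."""
    counters = {}
    # The first two lines are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        # The name may be fused to the first number ("ppp0:12345"),
        # so split on the colon rather than on whitespace.
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        counters[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_packets": int(fields[9]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return counters


def read_counters(path: str = "/proc/net/dev") -> dict:
    """Read and parse the counters from a live system."""
    with open(path) as handle:
        return parse_net_dev(handle.read())
```

Running read_counters()["ppp0"] twice, a few pings apart, should show
whether tx_packets keeps climbing while the replies go missing, or
whether tx_drop or tx_errs is growing instead.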

This occurs with kernel 2.4.13-ac6 and 2.4.14, as well as ppp 2.4.0f
and 2.4.1.  Here are the options in effect:

Nov  6 11:46:35 linux pppd[5157]: pppd options in effect:
Nov  6 11:46:35 linux pppd[5157]: debug^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: kdebug 1^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: ktune^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: dump^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: nomultilink^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: +pap^I^I# (from command line)
Nov  6 11:46:35 linux pppd[5157]: -chap^I^I# (from command line)
Nov  6 11:46:35 linux pppd[5157]: login^I^I# (from command line)
Nov  6 11:46:35 linux pppd[5157]: lock^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: crtscts^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: modem^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: asyncmap 0^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: lcp-echo-failure 4^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: lcp-echo-interval 30^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: hide-password^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: ipcp-accept-remote^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: ms-dns xxx # [don't know how to print value]^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: ms-wins xxx # [don't know how to print value]^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: proxyarp^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: netmask 255.255.255.0^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: 10.0.0.69:10.0.0.110^I^I# (from /etc/ppp/options.ttyR0)
Nov  6 11:46:35 linux pppd[5157]: bsdcomp 15^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: deflate 15^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: noipx^I^I# (from /etc/ppp/options)
Nov  6 11:46:35 linux pppd[5157]: pppd 2.4.1 started by a_ppp, uid 0

Any advice would be greatly appreciated.  I apologize if you see this
message multiple times, but I am sending it to several different
lists, as I am totally at a loss on how to proceed.  Hopefully the
solution is some brain fart on my part, such as
echo 1 > /proc/sys/net/ipv4/ppp_should_work

--
Jeff Lessem.


