
Re: Weird network behaviour - can anyone explain it?



Oliver Elphick wrote:

We have a machine whose network configuration is in some way wrong, but
I don't know how.

When it boots, the network is configured correctly, according to
ifconfig, but it takes forever for things (a deliberately vague word) to
be processed.  Then it seems to handle a number of requests all at once
and goes back to sleep for a while.  The effect is illustrated by this
ping across the local ethernet (other machines on the same net have no
problems):

#  ping braydb
PING braydb.somedomain.com (192.168.1.18): 56 data bytes
64 bytes from 192.168.1.18: icmp_seq=33 ttl=64 time=0.6 ms
64 bytes from 192.168.1.18: icmp_seq=34 ttl=64 time=0.8 ms
64 bytes from 192.168.1.18: icmp_seq=35 ttl=64 time=1.4 ms
64 bytes from 192.168.1.18: icmp_seq=18 ttl=64 time=17002.5 ms
64 bytes from 192.168.1.18: icmp_seq=19 ttl=64 time=16003.6 ms
64 bytes from 192.168.1.18: icmp_seq=20 ttl=64 time=15004.2 ms
64 bytes from 192.168.1.18: icmp_seq=21 ttl=64 time=14004.8 ms
64 bytes from 192.168.1.18: icmp_seq=22 ttl=64 time=13005.4 ms
64 bytes from 192.168.1.18: icmp_seq=23 ttl=64 time=12005.9 ms
64 bytes from 192.168.1.18: icmp_seq=24 ttl=64 time=11006.5 ms
64 bytes from 192.168.1.18: icmp_seq=25 ttl=64 time=10007.0 ms
64 bytes from 192.168.1.18: icmp_seq=26 ttl=64 time=9007.6 ms
64 bytes from 192.168.1.18: icmp_seq=27 ttl=64 time=8008.1 ms
64 bytes from 192.168.1.18: icmp_seq=28 ttl=64 time=7008.6 ms
64 bytes from 192.168.1.18: icmp_seq=29 ttl=64 time=6009.1 ms
64 bytes from 192.168.1.18: icmp_seq=30 ttl=64 time=5009.7 ms
64 bytes from 192.168.1.18: icmp_seq=31 ttl=64 time=4010.3 ms
64 bytes from 192.168.1.18: icmp_seq=32 ttl=64 time=3010.9 ms
--- braydb.somedomain.com ping statistics ---
49 packets transmitted, 18 packets received, 63% packet loss
round-trip min/avg/max = 0.6/8339.2/17002.5 ms

After some considerable time, this effect stops and normal response
times resume.  (I hope that this will also be the case on this occasion;
the machine has been running for 5 hours so far.)

Kernel is 2.4.20 SMP, built for this machine.
I can't identify the network card until the machine starts to respond
correctly (I am not on site).

The problem began a couple of months back; I do not know of any relevant
software change.  Since then, the machine has not been rebooted again
until today.

If the other machines on the network were pinging the same remote machine without trouble while the above was happening, then I'd suspect one of: the network cable to this computer; the port on the hub/switch that this computer plugs into; the driver for this computer's ethernet card; or the ethernet card itself. Trying a different port is a quick test, and so is trying a different ethernet cable.

If the problem is one of the above, you should see the same results when pinging another local machine while the problem is showing up in pings to that remote system.

The lag looks exactly like what I have seen in the past with oversold service or DSLAM problems on DSL (that is, routing or network congestion issues).
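
One way to read the quoted ping output is to work out when each reply actually came back. The Python sketch below is only an illustration, not part of the original report: it assumes ping was sending one probe per second (the usual default, but not stated above), and the function name arrival_times is made up for this example. It adds each reply's round-trip time to the assumed send time of its sequence number.

```python
import re

# Reply lines look like:
# "64 bytes from 192.168.1.18: icmp_seq=18 ttl=64 time=17002.5 ms"
REPLY = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")


def arrival_times(ping_output, interval_s=1.0):
    """Return (seq, arrival_s) pairs for every reply line found.

    arrival_s is the estimated time, in seconds after sequence 0 would
    have been sent, at which the reply came back. It relies on the
    assumption that probe `seq` left at seq * interval_s.
    """
    result = []
    for line in ping_output.splitlines():
        match = REPLY.search(line)
        if match:
            seq = int(match.group(1))
            rtt_s = float(match.group(2)) / 1000.0  # ms -> s
            result.append((seq, seq * interval_s + rtt_s))
    return result


# A few lines copied from the ping output quoted above.
sample = """\
64 bytes from 192.168.1.18: icmp_seq=35 ttl=64 time=1.4 ms
64 bytes from 192.168.1.18: icmp_seq=18 ttl=64 time=17002.5 ms
64 bytes from 192.168.1.18: icmp_seq=25 ttl=64 time=10007.0 ms
64 bytes from 192.168.1.18: icmp_seq=32 ttl=64 time=3010.9 ms
"""

for seq, t in arrival_times(sample):
    print(f"seq {seq:2d} came back about {t:.3f} s after sequence 0 was sent")
```

Under that one-second assumption, the delayed replies (sequence 18 through 32) and the prompt reply to sequence 35 all land within roughly 10 ms of the 35-second mark. That is consistent with the replies being held somewhere and then released in one burst, rather than each packet being slow on its own. It does not say where they were held.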

Good luck.
--
Jacob


