Re: Strange networking behavior
Hi
On Mon, Feb 03, 2014 at 09:56:24AM +0100, Florian Götz wrote:
> Hi Debian-Users,
> 
> I got a network problem with one of my Debian VMs.
> The VM runs on an ESX Host (5.1) with several other VMs (SLES11).
> 
> It's a Debian 6 (can't upgrade due to errors with the other running
> software at the moment) which hosts a Network Management System
> (Opsview) based on nagios.
Oh well. For this problem I suspect that the Debian version is
irrelevant.
> If I disconnect a network switch in another building several other hosts
> get unavailable (are not pingable anymore).
So I assume the "other hosts" are on a "remote" subnet (from the
Debian box's point of view).
> If I try to ping these hosts from one of the SLES machines on the same
> ESX host they are reachable.
The SLES hosts: are they on the same subnet as the Debian box?
Try traceroute, with two runs from the "other hosts":
- one towards one of the SLES machines
- one towards the Debian box
and check for differences in the output. Assuming that the SLES
machines are on the same subnet as the Debian box, the route should be
the same. The interesting point is where they differ...
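
If the traceroute output is long, the comparison can be done
mechanically. A minimal sketch in Python, assuming you have copied the
hop addresses from the two runs into lists by hand (the function name
and the addresses below are made up for illustration; they are
documentation addresses, not anything from your network):

```python
from itertools import zip_longest


def first_divergence(path_a, path_b):
    """Return (hop_number, hop_a, hop_b) for the first hop where the two
    paths differ, or None if both paths are identical.

    A missing hop (one path shorter than the other) shows up as None.
    """
    for hop, (a, b) in enumerate(zip_longest(path_a, path_b), start=1):
        if a != b:
            return hop, a, b
    return None


# Hypothetical hop lists, as seen from one of the "other hosts".
to_sles = ["192.0.2.1", "198.51.100.1", "203.0.113.10"]
to_debian = ["192.0.2.1", "198.51.100.254", "203.0.113.20"]

print(first_divergence(to_sles, to_debian))
```

The last hop is the destination itself, so it always differs between
the two runs. A difference at any earlier hop is the one worth looking
at.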
> So it isn't a failure in ESX Network Configuration etc.
Possibly. If the Debian box is on the same subnet (same VLAN & IP
range) as the SLES machines, then yes: this would most likely exclude
the ESX network side of things. If the Debian box is on a different
subnet, then this is still an open question.
 
> It takes about an hour before the hosts get back to state "pingable" on
> the Debian machine.
A whole hour!? That is way too long for spanning tree to settle
down. So spanning tree/routing can probably be eliminated as suspects.
A hunch: do you have a duplicate IP address on the subnet where the
Debian box lives? These can be hairy to debug... (but 1 hour is a lot
even for this type of problem).
The only reliable way I have found is to take the suspect box down,
and see whether the IP address still responds to ARP requests
(obviously sent from the same local subnet). If it does, then you can
use the MAC address to track down the other box with that IP address.
Alternatively, ping/arping it from neighbouring hosts (= same subnet)
and check whether they all get the same MAC address for the Debian
box. (This is pretty much pot luck and probably depends on the switch
behaviour.)
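
To make the comparison less error-prone, something like the following
Python sketch could help. It assumes you have noted by hand which MAC
address each neighbouring host reports for the Debian box's IP; the
host names, MAC addresses and function names are hypothetical:

```python
from collections import defaultdict


def normalise_mac(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def macs_seen(observations):
    """Map each distinct MAC to the list of observers that reported it.

    observations: dict of observer host name -> MAC reported for the IP.
    """
    seen = defaultdict(list)
    for observer, mac in observations.items():
        seen[normalise_mac(mac)].append(observer)
    return dict(seen)


def looks_duplicated(observations):
    """True if the observers disagree about the MAC behind the IP."""
    return len(macs_seen(observations)) > 1


# Hypothetical observations from three hosts on the same subnet.
observations = {
    "sles1": "00:50:56:AA:BB:01",
    "sles2": "00-50-56-aa-bb-01",
    "sles3": "00:0c:29:12:34:56",
}

print(macs_seen(observations))
print(looks_duplicated(observations))
```

If more than one MAC shows up, the odd one out is the address to chase
on the switches. A single MAC does not prove there is no duplicate,
only that these particular hosts did not see one.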
> So for any reason the debian host can't get to these hosts, but after a
> sort of random amount of time everything is fine again.
"Randomness" points towards a cache of some sort... somewhere.
Hopefully traceroute can point in the right direction.
> 
> Anyone got a hint where to search for a solution to that?
> 
> 
> Best regards
> Florian Götz
> -----------------------------------------------------------------
> 
> Dipl.-Inf. (FH) Florian Götz
> Rechenzentrum Hochschule Mannheim
> Paul-Wittsack-Straße 10    
> 68163 Mannheim
> Tel: 0621/292-6232
> 
> EMail:     f.goetz@hs-mannheim.de
> Internet:     http://www.rz.hs-mannheim.de
> 
> -----
> 
> 
-- 
Karl E. Jorgensen