kernel 2.6.2: no arp replies with Intel Etherexpress 100 -> no connection; works fine with 2.4.23
Hi,
OS is Debian Woody + backports (mainly backports.org).
This machine has 2 NICs. It does IP masquerading for the internal LAN.
eth0 is a Broadcom 4400 onboard interface (on Asus P4PE). Kernel module
is b44. It is used by pppoe to connect to my ADSL provider.
eth1 is an Intel Etherexpress 100 card, using the e100 kernel module
(but using the eepro100 module doesn't help either). It connects to the
internal LAN.
/etc/network/interfaces:
auto lo
iface lo inet loopback
# eth0 is pppoe
auto eth1
iface eth1 inet static
address 192.168.1.1
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
The problem shows up when connecting between 192.168.1.1 (masquerading
host) and 192.168.1.2 (laptop) on the internal LAN; details follow
further down. I get no connection in either direction, but the tests
below were done from 192.168.1.1 to 192.168.1.2. Just to be sure, I
tried it with the laptop running self-compiled versions of 2.4.23 and
2.6.0-test11, plus the Debian image for 2.6.2. Everything works fine
with each of those kernels on the laptop, as long as the masquerading
host runs 2.4.23. All tests were done with all ICMP filtering off on the
masquerading host.
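As a sanity check on the addressing, here is a minimal sketch (assuming
Python 3 with its standard ipaddress module is at hand; the helper name
same_link is made up for illustration). It derives the network and
broadcast addresses from the address and netmask in the eth1 stanza and
confirms that the laptop is on-link, which is why the host has to
resolve it by ARP on eth1 rather than route to it:

```python
import ipaddress

# Address and netmask copied from the eth1 stanza in /etc/network/interfaces.
iface = ipaddress.ip_interface("192.168.1.1/255.255.255.0")
# The laptop used in the tests.
laptop = ipaddress.ip_address("192.168.1.2")


def same_link(interface, peer):
    """Return True if peer lies inside the interface's configured subnet."""
    return peer in interface.network


print("network:  ", iface.network.network_address)
print("broadcast:", iface.network.broadcast_address)
print("laptop on-link:", same_link(iface, laptop))
```

The derived values should match the "network" and "broadcast" lines of
the stanza, so the static configuration itself looks consistent.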
*** 2.4.23 ***
Everything works fine. The kernel is self-compiled (IPv6 is not compiled
in, in case this matters).
/etc/modules.conf:
### update-modules: start processing /etc/modutils/nics
alias eth1 e100
alias eth0 b44
Output from ifconfig:
eth0 Link encap:Ethernet HWaddr 00:E0:18:F4:0E:AC
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:286 errors:0 dropped:0 overruns:0 frame:0
TX packets:312 errors:0 dropped:0 overruns:0 carrier:2
collisions:0 txqueuelen:1000
RX bytes:70652 (68.9 KiB) TX bytes:23143 (22.6 KiB)
Interrupt:10
eth1 Link encap:Ethernet HWaddr 00:A0:C9:3B:00:87
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:183 errors:0 dropped:0 overruns:0 frame:0
TX packets:173 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:14417 (14.0 KiB) TX bytes:16632 (16.2 KiB)
Interrupt:9 Base address:0xb000 Memory:f5000000-f5000038
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:934 errors:0 dropped:0 overruns:0 frame:0
TX packets:934 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:692767 (676.5 KiB) TX bytes:692767 (676.5 KiB)
ppp0 Link encap:Point-to-Point Protocol
inet addr:<external IP from ADSL> P-t-P:<IP of P-t-P partner>
Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1492 Metric:1
RX packets:224 errors:0 dropped:0 overruns:0 frame:0
TX packets:249 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:60800 (59.3 KiB) TX bytes:12577 (12.2 KiB)
While pinging from 192.168.1.1 to 192.168.1.2, I get this from tcpdump:
root@sonic: ~ # tcpdump host 192.168.1.1 and 192.168.1.2 -i eth1 -n
tcpdump: listening on eth1
15:58:19.473365 arp who-has 192.168.1.2 tell 192.168.1.1
15:58:19.473430 192.168.1.1 > 192.168.1.2: icmp: echo request (DF)
15:58:19.473593 arp reply 192.168.1.2 is-at 0:50:ba:77:77:cd
15:58:19.473694 192.168.1.2 > 192.168.1.1: icmp: echo reply
15:58:20.556722 192.168.1.1 > 192.168.1.2: icmp: echo request (DF)
15:58:20.556980 192.168.1.2 > 192.168.1.1: icmp: echo reply
[etc.]
60 packets received by filter
0 packets dropped by kernel
Looks fine and works.
On the other hand:
*** 2.6.2 ***
I get no connection to 192.168.1.2. The kernel is the image from
backports.org, with Herbert Xu's default config. Except for the current
issue, everything works fine.
/lib/modules/modprobe.conf:
### update-modules: start processing ethernet ###
alias eth1 e100
alias eth0 b44
Output from ifconfig:
eth0 Link encap:Ethernet HWaddr 00:E0:18:F4:0E:AC
inet6 addr: fe80::2e0:18ff:fef4:eac/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:144 errors:0 dropped:0 overruns:0 frame:0
TX packets:152 errors:0 dropped:0 overruns:0 carrier:5
collisions:0 txqueuelen:1000
RX bytes:18886 (18.4 KiB) TX bytes:11171 (10.9 KiB)
Interrupt:20
eth1 Link encap:Ethernet HWaddr 00:A0:C9:3B:00:87
inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::2a0:c9ff:fe3b:87/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:68 errors:0 dropped:3 overruns:0 frame:3
TX packets:61 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:2730 (2.6 KiB)
Interrupt:22 Base address:0xb000 Memory:f5000000-f5000038
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:243 errors:0 dropped:0 overruns:0 frame:0
TX packets:243 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:20712 (20.2 KiB) TX bytes:20712 (20.2 KiB)
ppp0 Link encap:Point-to-Point Protocol
inet addr:<external IP from ADSL> P-t-P:<IP of P-t-P partner>
Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1492 Metric:1
RX packets:94 errors:0 dropped:0 overruns:0 frame:0
TX packets:97 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:13182 (12.8 KiB) TX bytes:4991 (4.8 KiB)
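One difference between the two ifconfig outputs that may be worth a look
is the RX side of eth1. The following is only a sketch (Python 3 assumed;
the function names rx_counters and oddities are made up for
illustration) that pulls the RX counters out of the eth1 lines quoted
above and flags anything unusual. Whether the flagged values point to a
real receive problem or only to a counter-accounting quirk of the driver
is not something this check can decide.

```python
import re

RX_LINE = re.compile(
    r"RX packets:(\d+) errors:(\d+) dropped:(\d+) overruns:(\d+) frame:(\d+)"
)
RX_BYTES = re.compile(r"RX bytes:(\d+)")

# eth1 counter lines copied verbatim from the two ifconfig outputs.
eth1_2_4_23 = """RX packets:183 errors:0 dropped:0 overruns:0 frame:0
RX bytes:14417 (14.0 KiB) TX bytes:16632 (16.2 KiB)"""
eth1_2_6_2 = """RX packets:68 errors:0 dropped:3 overruns:0 frame:3
RX bytes:0 (0.0 b) TX bytes:2730 (2.6 KiB)"""


def rx_counters(block):
    """Extract the RX counters of one interface block into a dict of ints."""
    names = ("packets", "errors", "dropped", "overruns", "frame")
    counters = dict(zip(names, map(int, RX_LINE.search(block).groups())))
    counters["bytes"] = int(RX_BYTES.search(block).group(1))
    return counters


def oddities(counters):
    """List counter values that look inconsistent or indicate RX trouble."""
    notes = []
    if counters["packets"] > 0 and counters["bytes"] == 0:
        notes.append("packets counted but RX bytes is 0")
    for name in ("errors", "dropped", "overruns", "frame"):
        if counters[name] > 0:
            notes.append(f"{name}: {counters[name]}")
    return notes


for label, block in (("2.4.23", eth1_2_4_23), ("2.6.2", eth1_2_6_2)):
    print(label, oddities(rx_counters(block)))
```

With the values above, nothing is flagged for 2.4.23, while 2.6.2 shows
received packets with zero received bytes plus three dropped packets and
three frame errors.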
The problem is that while pinging from 192.168.1.1 to 192.168.1.2, I get
this:
root@sonic: ~ # tcpdump host 192.168.1.1 and 192.168.1.2 -i eth1 -n
tcpdump: listening on eth1
16:06:35.744986 arp who-has 192.168.1.2 tell 192.168.1.1
16:06:36.828116 arp who-has 192.168.1.2 tell 192.168.1.1
16:06:37.914525 arp who-has 192.168.1.2 tell 192.168.1.1
16:06:38.997629 arp who-has 192.168.1.2 tell 192.168.1.1
[etc.]
23 packets received by filter
0 packets dropped by kernel
There clearly is something fishy: I never get any arp replies, and the
ping fails (100% packet loss).
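To make the comparison between the two captures explicit, here is a
small sketch (Python 3 assumed; the function name unanswered_arp is made
up for illustration). It scans tcpdump lines in the text format shown
above and reports which addresses were asked for with "arp who-has" but
never answered with "arp reply". The sample lines are copied from the
two captures in this mail.

```python
import re

REQUEST = re.compile(r"arp who-has (\S+) tell (\S+)")
REPLY = re.compile(r"arp reply (\S+) is-at (\S+)")


def unanswered_arp(lines):
    """Return the set of IPs that were requested but never got a reply."""
    asked, answered = set(), set()
    for line in lines:
        match = REQUEST.search(line)
        if match:
            asked.add(match.group(1))
            continue
        match = REPLY.search(line)
        if match:
            answered.add(match.group(1))
    return asked - answered


capture_2_4_23 = [
    "15:58:19.473365 arp who-has 192.168.1.2 tell 192.168.1.1",
    "15:58:19.473430 192.168.1.1 > 192.168.1.2: icmp: echo request (DF)",
    "15:58:19.473593 arp reply 192.168.1.2 is-at 0:50:ba:77:77:cd",
    "15:58:19.473694 192.168.1.2 > 192.168.1.1: icmp: echo reply",
]
capture_2_6_2 = [
    "16:06:35.744986 arp who-has 192.168.1.2 tell 192.168.1.1",
    "16:06:36.828116 arp who-has 192.168.1.2 tell 192.168.1.1",
    "16:06:37.914525 arp who-has 192.168.1.2 tell 192.168.1.1",
    "16:06:38.997629 arp who-has 192.168.1.2 tell 192.168.1.1",
]

print("2.4.23:", unanswered_arp(capture_2_4_23))  # set()
print("2.6.2: ", unanswered_arp(capture_2_6_2))   # {'192.168.1.2'}
```

Note that this only describes what the capture on eth1 of the
masquerading host shows; it cannot tell whether the laptop never sent a
reply or whether the reply was sent but not received.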
I have tried switching the aliases of eth0 and eth1 so that the Intel
card is eth0 and points to the DSL modem, while the Broadcom onboard
NIC is eth1 and points to the internal LAN. The result was as expected:
I couldn't connect to DSL.
This led me to believe that something's not working with the Intel card
in kernel 2.6.2. I subsequently tried the eepro100 module (which I have
used successfully with this card in the past), but it didn't change
anything.
I am lost now and have no idea how to proceed. If you can't tell me
what's wrong, I'd appreciate pointers to what else to check or what
tests to run.
Regards, Mario