
Bug#572201: forcedeth driver hangs under heavy load



Eric Dumazet wrote:
On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:

Do you have any netfilter rules?


Hi Eric,

I don't have any netfilter rules:

root@node34:~# for table in filter nat mangle raw; do iptables -t $table -L; done
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination


I re-ran this on the 2.6.32 kernel (with the 2.6.32 forcedeth module) just in case that was screwing something up.

node33 is in the unresponsive state this time. I'm running tcpdump on node34. On node33 I try to ssh to node34 (using the IP address of node34). I note that I can ping between node33 and node34.

root@node34:~# tcpdump -v host node34 and node33
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
17:05:19.622384 IP (tos 0x0, ttl 64, id 21435, offset 0, flags [DF], proto TCP (6), length 60)
    node33.webstar.cnet.43653 > node34.ssh: Flags [S], cksum 0xb994 (correct), seq 1675314077, win 5840, options [mss 1460,sackOK,TS val 331814 ecr 0,nop,wscale 7], length 0
17:05:19.622754 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    node34.ssh > node33.webstar.cnet.43653: Flags [S.], cksum 0x9d81 (correct), seq 1669769379, ack 1675314078, win 5792, options [mss 1460,sackOK,TS val 331779 ecr 331814,nop,wscale 7], length 0
17:05:19.622813 IP (tos 0x0, ttl 64, id 21436, offset 0, flags [DF], proto TCP (6), length 52)
    node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe2bf (correct), ack 1, win 46, options [nop,nop,TS val 331814 ecr 331779], length 0
17:05:19.627666 IP (tos 0x0, ttl 64, id 47271, offset 0, flags [DF], proto TCP (6), length 84)
    node34.ssh > node33.webstar.cnet.43653: Flags [P.], seq 1:33, ack 1, win 46, options [nop,nop,TS val 331780 ecr 331814], length 32
17:05:19.627748 IP (tos 0x0, ttl 64, id 21437, offset 0, flags [DF], proto TCP (6), length 52)
    node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe29c (correct), ack 33, win 46, options [nop,nop,TS val 331816 ecr 331780], length 0
17:05:19.627833 IP (tos 0x0, ttl 64, id 21438, offset 0, flags [DF], proto TCP (6), length 84, bad cksum 1f8a (->d189)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 23413:23445, ack 2749038625, win 46, options [nop,nop,TS val 331816 ecr 331780], length 32
17:05:19.831634 IP (tos 0x0, ttl 64, id 21439, offset 0, flags [DF], proto TCP (6), length 84, bad cksum d189 (->d188)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack 33, win 46, options [nop,nop,TS val 331867 ecr 331780], length 32
17:05:20.239603 IP (tos 0x0, ttl 64, id 21440, offset 0, flags [DF], proto TCP (6), length 84, bad cksum 15c6 (->d187)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 30492:30524, ack 809893921, win 46, options [nop,nop,TS val 331969 ecr 331780], length 32
17:05:21.055534 IP (tos 0x0, ttl 64, id 21441, offset 0, flags [DF], proto TCP (6), length 84, bad cksum d187 (->d186)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack 33, win 46, options [nop,nop,TS val 332173 ecr 331780], length 32
17:05:22.687386 IP (tos 0x0, ttl 64, id 21442, offset 0, flags [DF], proto TCP (6), length 84, bad cksum d186 (->d185)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack 33, win 46, options [nop,nop,TS val 332581 ecr 331780], length 32
17:05:25.950935 IP (tos 0x0, ttl 64, id 21443, offset 0, flags [DF], proto TCP (6), length 84, bad cksum 15c4 (->d184)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 30492:30524, ack 809893921, win 46, options [nop,nop,TS val 333397 ecr 331780], length 32
17:05:32.478527 IP (tos 0x0, ttl 64, id 21444, offset 0, flags [DF], proto TCP (6), length 84, bad cksum c01 (->d183)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 43997:44029, ack 1311047713, win 46, options [nop,nop,TS val 335029 ecr 331780], length 32
17:05:45.533370 IP (tos 0x0, ttl 64, id 21445, offset 0, flags [DF], proto TCP (6), length 84, bad cksum 23d (->d182)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 3348:3380, ack 4054450209, win 46, options [nop,nop,TS val 338293 ecr 331780], length 32
17:06:08.719187 IP (tos 0x0, ttl 64, id 27660, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum 5360 (->b3b3)!)
    node33.webstar.cnet.50060 > node34.35725: Flags [.], seq 1203473738:1203475186, ack 1191452767, win 54, options [nop,nop,TS val 344089 ecr 256770], length 1448
17:06:11.643080 IP (tos 0x0, ttl 64, id 21446, offset 0, flags [DF], proto TCP (6), length 84, bad cksum e4f2 (->d181)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 47331:47363, ack 4110811169, win 46, options [nop,nop,TS val 344821 ecr 331780], length 32
17:06:13.715233 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has node34 tell node33.webstar.cnet, length 46
17:06:13.715257 ARP, Ethernet (len 6), IPv4 (len 4), Reply node34 is-at 00:30:48:f0:06:72 (oui Unknown), length 28
17:07:03.866492 IP (tos 0x0, ttl 64, id 21447, offset 0, flags [DF], proto TCP (6), length 84, bad cksum b413 (->d180)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 28939:28971, ack 1913782305, win 46, options [nop,nop,TS val 357877 ecr 331780], length 32
17:07:08.862055 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has node34 tell node33.webstar.cnet, length 46
17:07:08.862370 ARP, Ethernet (len 6), IPv4 (len 4), Reply node34 is-at 00:30:48:f0:06:72 (oui Unknown), length 28
17:07:19.627910 IP (tos 0x0, ttl 64, id 47272, offset 0, flags [DF], proto TCP (6), length 52)
    node34.ssh > node33.webstar.cnet.43653: Flags [F.], cksum 0x6d6b (correct), seq 33, ack 1, win 46, options [nop,nop,TS val 361780 ecr 331816], length 0
17:07:19.628403 IP (tos 0x0, ttl 64, id 21448, offset 0, flags [DF], proto TCP (6), length 844, bad cksum aa4d (->ce87)!)
    node33.webstar.cnet.43653 > node34.ssh: Flags [FP.], seq 20399:21191, ack 2356871202, win 46, options [nop,nop,TS val 361818 ecr 361780], length 792
17:07:19.833456 IP (tos 0x0, ttl 64, id 47273, offset 0, flags [DF], proto TCP (6), length 52)
    node34.ssh > node33.webstar.cnet.43653: Flags [F.], cksum 0x6d37 (correct), seq 33, ack 1, win 46, options [nop,nop,TS val 361832 ecr 331816], length 0
17:07:19.833517 IP (tos 0x0, ttl 64, id 21449, offset 0, flags [DF], proto TCP (6), length 64)
    node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xa5e9 (correct), ack 34, win 46, options [nop,nop,TS val 361870 ecr 361832,nop,nop,sack 1 {33:34}], length 0
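
For anyone wanting to cross-check the "bad cksum X (->Y)!" annotations in the capture, here is a minimal Python sketch that recomputes the IPv4 header checksum (the standard ones'-complement sum over 16-bit words) from the fields tcpdump prints. It is an illustration only, not part of the original test. The helper names build_header and ipv4_header_checksum are made up for this example, and it assumes node33's address is 10.141.0.33 (only node34's address, 10.141.0.34, appears in this message) and a plain 20-byte header with no IP options.

```python
import socket
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Ones'-complement checksum over the header, with the checksum field zeroed."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(ident: int, total_len: int, src: str, dst: str,
                 ttl: int = 64, proto: int = 6) -> bytes:
    """Hypothetical helper: 20-byte IPv4 header, tos 0, DF set, offset 0."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,          # version 4, header length 5 words (assumed: no options)
        0,             # tos 0x0
        total_len,     # "length" as printed by tcpdump
        ident,         # "id" as printed by tcpdump
        0x4000,        # flags [DF], offset 0
        ttl,
        proto,         # 6 = TCP
        0,             # checksum field zeroed for the calculation
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


# Packet with id 21438, length 84; node33's address is an assumption.
hdr = build_header(21438, 84, "10.141.0.33", "10.141.0.34")
print(hex(ipv4_header_checksum(hdr)))  # prints 0xd189
```

Under those assumptions the result agrees with the value tcpdump reports as expected for that packet (->d189), which suggests the value after the arrow is the checksum the header should have carried, while the value before it is what actually arrived on the wire.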

At this point, I see a "Connection closed by 10.141.0.34" message on node33 (from where I am attempting to ssh).

Again, if I run ifdown and then ifup on node33, I can then connect from node33 to node34 without problems.

-stephen


