Bug#283107: telnet-connections to some hosts fail
On Mon, 18 Apr 2005, Raoul Borenius wrote:
Hi Jurij,
On Fri, Apr 15, 2005 at 12:46:30AM -0400, Jurij Smakov wrote:
Hi Raoul,
Could you please check whether the situation has improved with the
current Debian kernels in testing/unstable?
Sorry, no improvement with
kernel-image-2.4.27-2-sparc64 (2.4.27-9)
kernel-image-2.6.8-2-sparc64 (2.6.8-15)
kernel-image-2.6.10-1-sparc64 (2.6.10-6)
If it helps, I could give you ssh access to our test-sparc if you send me
your IP address.
Thanks for giving me a chance to play with your setup. When I ran tcpdump
at a more verbose level, this is what I saw at the point where the
connection hangs (lines broken for clarity):
03:44:49.038544 IP (tos 0x0, ttl 64, id 49580, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
0x0000: 4500 002a c1ac 4000 4006 c29b c25f edfe E..*..@.@...._..
0x0010: bc01 4a26 800e 0017 7041 0769 6dee 1a53 ..J&....pA.im..S
0x0020: 5018 16d0 b6a2 0000 0d00 P.........
03:44:49.245092 IP (tos 0x0, ttl 64, id 49582, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
0x0000: 4500 002a c1ae 4000 4006 c299 c25f edfe E..*..@.@...._..
0x0010: bc01 4a26 800e 0017 7041 0769 6dee 1a53 ..J&....pA.im..S
0x0020: 5018 16d0 b6a2 0000 0d00 P.........
03:44:49.659025 IP (tos 0x0, ttl 64, id 49584, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
0x0000: 4500 002a c1b0 4000 4006 c297 c25f edfe E..*..@.@...._..
0x0010: bc01 4a26 800e 0017 7041 0769 6dee 1a53 ..J&....pA.im..S
0x0020: 5018 16d0 b6a2 0000 0d00 P.........
and so on. So it looks like _outgoing_ packets have a bad TCP checksum and
are probably discarded by the other host. I believe that tcpdump captures
them after they hit the wire, so that leaves two possibilities: either
the kernel constructs broken packets for transmission (which is somewhat
unlikely, as we would have plenty of bug reports in that case) or there is
a hardware failure. Given that you see the bug on a number of different
kernels, I tend to believe the latter. Do you have an identical machine
that you could put in place of testsparc and try, to eliminate this
possibility?
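
For anyone who wants to double-check the mismatch tcpdump reports, here is
a minimal sketch in Python that recomputes the TCP checksum of the first
packet from the hex dump above. The helper names (ones_complement_sum,
tcp_checksum) are my own and hypothetical, not part of any tool. The sketch
assumes a plain IPv4 packet and uses the usual pseudo-header (source
address, destination address, protocol 6, TCP length).

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def tcp_checksum(ip_packet: bytes) -> int:
    """Checksum the TCP segment of an IPv4 packet should carry."""
    ihl = (ip_packet[0] & 0x0F) * 4                   # IP header length
    total_len = struct.unpack("!H", ip_packet[2:4])[0]
    src, dst = ip_packet[12:16], ip_packet[16:20]
    segment = bytearray(ip_packet[ihl:total_len])
    segment[16:18] = b"\x00\x00"                      # zero the checksum field
    pseudo = src + dst + struct.pack("!BBH", 0, 6, len(segment))
    return ~ones_complement_sum(pseudo + bytes(segment)) & 0xFFFF

# First packet from the tcpdump output above (42 bytes, IP header first).
PACKET = bytes.fromhex(
    "4500002ac1ac40004006c29bc25fedfe"
    "bc014a26800e0017704107696dee1a53"
    "501816d0b6a200000d00"
)

ihl = (PACKET[0] & 0x0F) * 4
stored = struct.unpack("!H", PACKET[ihl + 16:ihl + 18])[0]
expected = tcp_checksum(PACKET)
print("stored=0x%04x expected=0x%04x" % (stored, expected))
# prints: stored=0xb6a2 expected=0x5563
```

The recomputed value agrees with the 5563 that tcpdump says the checksum
should be, while the packet itself carries b6a2.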
Best regards,
Jurij Smakov jurij@wooyd.org
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC