
Bug#283107: telnet-connections to some hosts fail



On Mon, 18 Apr 2005, Raoul Borenius wrote:

Hi Jurij,

On Fri, Apr 15, 2005 at 12:46:30AM -0400, Jurij Smakov wrote:
Hi Raoul,

Could you please check whether the situation has improved with the
current Debian kernels in testing/unstable?

Sorry, no improvement with

kernel-image-2.4.27-2-sparc64 (2.4.27-9)
kernel-image-2.6.8-2-sparc64 (2.6.8-15)
kernel-image-2.6.10-1-sparc64 (2.6.10-6)

If it helps I could give you ssh-access to our test-sparc if you send me your
IP-Address.

Thanks for giving me a chance to play with your setup. When I ran tcpdump at a more verbose level, this is what I saw at the point where the connection hangs (lines broken for clarity):

03:44:49.038544 IP (tos 0x0, ttl  64, id 49580, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
        0x0000:  4500 002a c1ac 4000 4006 c29b c25f edfe  E..*..@.@...._..
        0x0010:  bc01 4a26 800e 0017 7041 0769 6dee 1a53  ..J&....pA.im..S
        0x0020:  5018 16d0 b6a2 0000 0d00                 P.........
03:44:49.245092 IP (tos 0x0, ttl 64, id 49582, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
        0x0000:  4500 002a c1ae 4000 4006 c299 c25f edfe  E..*..@.@...._..
        0x0010:  bc01 4a26 800e 0017 7041 0769 6dee 1a53  ..J&....pA.im..S
        0x0020:  5018 16d0 b6a2 0000 0d00                 P.........
03:44:49.659025 IP (tos 0x0, ttl 64, id 49584, offset 0, flags [DF],
length: 42) testsparc.32782 > m-nas1.telnet: P [bad tcp cksum b6a2 (->5563)!]
31:33(2) ack 77 win 5840
        0x0000:  4500 002a c1b0 4000 4006 c297 c25f edfe  E..*..@.@...._..
        0x0010:  bc01 4a26 800e 0017 7041 0769 6dee 1a53  ..J&....pA.im..S
        0x0020:  5018 16d0 b6a2 0000 0d00                 P.........

and so on. So it looks like _outgoing_ packets have a bad TCP checksum and are probably discarded by the other host. I believe that tcpdump captures them after they hit the wire, so that leaves two possibilities: either the kernel constructs broken packets for transmission (which is somewhat unrealistic, as we would have plenty of bug reports in that case) or there is a hardware failure. Given that you see the bug on a number of different kernels, I tend to believe the latter. Do you have an identical machine which you could put in place of testsparc and try, to eliminate this possibility?
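
For anyone who wants to check tcpdump's verdict by hand, here is a minimal Python sketch that recomputes the TCP checksum of the first packet in the dump above. The bytes are copied from the hex dump; the function and variable names are purely illustrative, and the code assumes a plain IPv4 packet starting at the IP header, with no link-layer header in front.

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(packet: bytes) -> bytes:
    """Source address, destination address, zero, protocol, TCP length."""
    ihl = (packet[0] & 0x0F) * 4
    total_len = struct.unpack("!H", packet[2:4])[0]
    return packet[12:20] + struct.pack("!BBH", 0, packet[9], total_len - ihl)

def tcp_checksum(packet: bytes) -> int:
    """Checksum the TCP segment should carry (checksum field zeroed first)."""
    ihl = (packet[0] & 0x0F) * 4
    total_len = struct.unpack("!H", packet[2:4])[0]
    segment = bytearray(packet[ihl:total_len])
    segment[16:18] = b"\x00\x00"
    return ~ones_complement_sum(pseudo_header(packet) + bytes(segment)) & 0xFFFF

def stored_tcp_checksum(packet: bytes) -> int:
    """Checksum value actually present in the captured packet."""
    ihl = (packet[0] & 0x0F) * 4
    return struct.unpack("!H", packet[ihl + 16:ihl + 18])[0]

# First packet from the capture (timestamp 03:44:49.038544).
PACKET = bytes.fromhex(
    "4500002ac1ac40004006c29bc25fedfe"
    "bc014a26800e0017704107696dee1a53"
    "501816d0b6a200000d00"
)

print(f"stored:   {stored_tcp_checksum(PACKET):04x}")
print(f"expected: {tcp_checksum(PACKET):04x}")
print(f"pseudo-header sum: {ones_complement_sum(pseudo_header(PACKET)):04x}")
```

This prints b6a2 as the stored value and 5563 as the expected one, in agreement with the "bad tcp cksum b6a2 (->5563)!" annotation. One detail the arithmetic exposes: the folded sum of the pseudo-header alone is also b6a2, the same value the captured packets carry. That is only an observation about the numbers, not a diagnosis.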

Best regards,

Jurij Smakov                                        jurij@wooyd.org
Key: http://www.wooyd.org/pgpkey/                   KeyID: C99E03CC


