
Re: golang-github-katalix-go-l2tp



On  Wed, Feb 14, 2024 at 21:08:38 +0000, Tom Parkin wrote:
> On  Wed, Feb 14, 2024 at 18:44:54 +0100, Simon Josefsson wrote:
> > Tom Parkin <tparkin@katalix.com> writes:
> > 
> > > On  Tue, Jan 23, 2024 at 18:05:23 +0100, Simon Josefsson wrote:
> > >> Tom Parkin <tparkin@katalix.com> writes:
> > >> 
> > >> > Hi Simon,
> > >> >
> > >> > On  Mon, Jan 22, 2024 at 20:15:11 +0100, Simon Josefsson wrote:
> > >> >> golang-github-katalix-go-l2tp
> > >> >> https://salsa.debian.org/jas/golang-google-grpc/-/jobs/5191076
> > >> >> === RUN   TestBasicSendReceive/5:_send/recv_[::1]:9000_[::1]:9001_L2TPv3_IP
> > >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> > >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> > >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> > >> >> level=error function=transport message="socket read failed" error="resource temporarily unavailable"
> > >> >> level=error function=transport message="transport down" error="transmit of avpMsgTypeHello failed after 3 retry attempts"
> > >> >>     transport_test.go:388: test sender function reported an error:
> > >> >> failed to send Hello message: transmit of avpMsgTypeHello failed
> > >> >> after 3 retry attempts
> > >> >> panic: test timed out after 10m0s
> > >> >
> > >> > This test is failing to send a packet over an IPv6 L2TPIP socket: it
> > >> > will depend on the go runtime support for L2TPIP (which has been in
> > > >> > for ages), and also the kernel having the l2tp_ip6 driver loaded.
> > >> >
> > >> > I'd sort of expect to see messages along those lines when trying to
> > >> > open the socket, though, rather than tx/rx failing :-/
> > >> >
> > >> > I'm not at all familiar with the environment of the Salsa test
> > >> > pipeline -- could you expand on what the configuration is here?
> > >> 
> > >> Thanks for looking at the logs Tom.  I don't really know much about the
> > >> environment except for these pointers:
> > >> 
> > >> https://wiki.debian.org/Salsa/Doc#Runners
> > >> https://salsa.debian.org/salsa-ci-team/pipeline/
> > >> 
> > > >> Does it set up a server on ::1 properly?  Any outbound connections?  Only
> > >> http(s) is allowed.
> > >
> > > So I *think* the runtime env is a VM using the "Google Container-Optimized
> > > OS".  The fact that the socket opens successfully but the packet is
> > > apparently lost is suggestive of some kind of firewalling.  I'll see
> > > if I can figure anything out from the Google docs.
> > >
> > > The tests work OK when run manually and when run as part of the
> > > package build here, so I think it must be something specific to the
> > > pipeline VM but I'm not sure what at the moment.
> > >
> > > In terms of the test configuration, it basically opens a socket for
> > > each end of the connection and verifies it can send/receive over those
> > > sockets.  It's the same test code for each configuration.
> > 
> > This problem happens for others:
> > 
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1063746
> > 
> > Interestingly the failure seems arch-specific:
> > 
> > https://ci.debian.net/packages/g/golang-github-katalix-go-l2tp/
> 
> Interesting -- thank you for the further information.  The fact it
> seems arch-specific is striking as you say, but it's odd that amd64 is
> failing since that's what the code has been developed on.
> 
> If I can reproduce it in a sid chroot that'll be a good starting point
> I think.  I will try this and see if I can get any more information.
> 
> I've unfortunately not had time to dig further into the Google VM
> docs; so a way to reproduce it outside that environment would be most
> welcome.
> 
> > It could well still be something in the salsa and debci VMs, and on the
> > #1063746 reporter's machine, that is causing this -- but it seems this
> > clearly happens often enough, and is causing build failures checking
> > reverse dependencies of several packages going into experimental, so it
> > would be nice to fix it.  Do you have any ideas?  Could some test be
> > disabled or silenced somehow?  I'm ignoring build failures in
> > golang-github-katalix-go-l2tp meanwhile.
> 
> Possibly the test could be skipped if we could figure out the root
> cause.
> 
> I did find when working on Fedora packaging that F38 had a strange
> issue whereby the l2tp_ip kernel module was blacklisted, which would
> cause the first IP encap test to fail.  Strangely the l2tp_ip6 module
> is not blacklisted, so on the second time around the IP encap test
> would pass as the l2tp_ip6 module would autoload l2tp_ip as the former
> depends on the latter.
> 
> There's a fix in go-l2tp upstream for this issue, I'm not sure whether
> something similar might apply here.  If I can reproduce the issue I'll
> see whether a workaround can be applied.

Hi Simon,

Just a quick update from me on the go-l2tp FTBFS.

I've now managed to isolate the problem to a bad upstream kernel
commit which has unfortunately been backported to (at least) the
linux-6.1.y stable branch, which Debian is using.

I will work on a kernel fix which will hopefully eventually make its
way to distro kernels, but in the meantime I will have to patch the
specific test case out of go-l2tp.

Thanks,
Tom
-- 
Tom Parkin
Katalix Systems Ltd
https://katalix.com
Catalysts for your Embedded Linux software development
