Re: Bug#964541: flatpak: Wrong argument order for clone syscall seccomp filter on s390x
On Wed, 05 Aug 2020 at 16:28:50 +0200, Julian Andres Klode wrote:
> clone() is a mad syscall with about 4 different argument orders. While
> most of them agree that argument 0 is flags, s390 and s390x have the
> flags argument second - A0 is the child stack pointer there.
It looks as though testing __s390x__ is redundant, because s390x compilers
also define __s390__, but perhaps it's safer to do so.
bubblewrap (without which flatpak won't work) uses:
#if defined(__s390__) || defined(__CRIS__)
return (int) syscall (__NR_clone, child_stack, flags);
return (int) syscall (__NR_clone, flags, child_stack);
so I think for completeness we should also be testing __CRIS__. The
clone(2) man page confirms this.
might also be a useful reference here: it has the special case you suggest
on __s390x__, __s390__ and __CRIS__, and additionally has a separate
implementation for sparc in assembler.
The order of the parent TID, child TID and TLS is not relevant for
bubblewrap because it doesn't pass the flags that would make the kernel
look at those arguments.
According to codesearch.debian.net these locations might need a
The clone(2) man page notes that blackfin, m68k and sparc have a different
calling convention again, so bubblewrap probably doesn't work there, and
we should perhaps force it to FTBFS on those architectures until/unless
someone contributes and tests a code path for raw_clone() for them; we
don't detect this problem in build-time tests because Debian kernels,
and in particular buildds, don't allow unprivileged users to create new
user namespaces without a setuid-root executable being involved, so the
just-built copy of bubblewrap won't work anyway. In practice the major
users of bubblewrap all want a seccomp filter, and libseccomp FTBFS on
m68k and sparc64, so I don't think also removing bubblewrap would have
a significant impact.
Alternatively, someone could try refactoring bubblewrap.c so that
it can use the glibc clone() wrapper, but that's a very significant
change, because the underlying syscall has a fork()-like interface
(and bubblewrap assumes that) but the glibc clone() wrapper makes it
look more like pthread_create() - so all the information that's used by
the child would have to be bundled into a struct or turned into global
variables. That seems like definitely something to do upstream rather
than downstream, and might be a change that would be rejected outright
if it hurts the clarity of what is already rather subtle code.
Another possible route would be to steal raw-clone.h from systemd -
bubblewrap's current raw_clone() takes a child_stack argument, while
systemd's raw_clone() doesn't take the child_stack argument and assumes
NULL, but bubblewrap uses NULL anyway, so that doesn't really matter.