Re: Libfuse interoperability/ABI broken.
- To: Bernd Schubert <bernd.schubert@fastmail.fm>
- Cc: Ashley Pittman <ashley@pittman.co.uk>, trapexit@spawn.link, "Laszlo Boszormenyi (GCS)" <gcs@debian.org>, debian-devel@lists.debian.org, ubuntu-devel-discuss@lists.ubuntu.com, "fuse-devel@lists.sourceforge.net" <fuse-devel@lists.sourceforge.net>, Miklos Szeredi <miklos@szeredi.hu>, Nikolaus Rath <Nikolaus@rath.org>, Giulio Benetti <giulio.benetti@benettiengineering.com>, Dharmendra Singh <dsingh@ddn.com>
- Subject: Re: Libfuse interoperability/ABI broken.
- From: Amir Goldstein <amir73il@gmail.com>
- Date: Wed, 13 Mar 2024 19:11:39 +0200
- Message-id: <[🔎] CAOQ4uxiz=Xce4_NSixSuW+gHM7MQomfPnmYCQqCZbVtiZX0hRQ@mail.gmail.com>
- In-reply-to: <[🔎] eb5208a4-2b3e-4fa7-bd97-fbe2573d04db@fastmail.fm>
- References: <76E4B0CB-E2B3-4A12-8EAB-868C15599F19@pittman.co.uk> <b1710d2c-9b82-4641-8264-37b0b4fb98b2@fastmail.fm> <ACA7CE7F-3A1D-4E23-90DA-4AB3CA67B8A1@pittman.co.uk> <[🔎] 50f50b57-7f02-46d2-a30d-5da1bf948700@fastmail.fm> <[🔎] CAOQ4uxgFTZfJF11z7MdgFWDn2fvWcAuYgCJKeh+N2C-Wmfpjvg@mail.gmail.com> <[🔎] eb5208a4-2b3e-4fa7-bd97-fbe2573d04db@fastmail.fm>
On Wed, Mar 13, 2024 at 1:57 AM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
> Hi Amir,
>
> thanks for your help!
>
> On 3/9/24 03:46, Amir Goldstein wrote:
> > On Thu, Mar 7, 2024 at 8:47 PM Bernd Schubert
> > <bernd.schubert@fastmail.fm> wrote:
> >>
> >> Hi all,
> >>
> >> this is certainly not kind of the mail I was hoping for as a new libfuse
> >> maintainer.
> >>
> >> As you can see from the title and from discussion below (sorry this is
> >> not typical ML discussion style), we have a bit of of problem with
> >> libfuse ABI compatibility.
> >>
> >> While scanning through git history, Ashley found a commit that was adding
> >> members in the middle of struct - doesn't break API, but binary
> >> compatibility. That commit already landed a year ago in these releases:
> >>
> >> git tag --contains a5eb7f2a0117ab43119ef5724cf5f4f2f181804a
> >> fuse-3.14.1
> >> fuse-3.15.0
> >> fuse-3.15.1
> >> fuse-3.16.1
> >> fuse-3.16.2
> >>
> >>
> >> Obviously this needs improved testing for, but right now we wonder what
> >> would be the preferred action to avoid issues.
> >>
> >> a) We could fix it and move bits to the right place. Fixes everything
> >> compiled with old versions, but breaks everything compiled with the
> >> versions above
> >>
> >> b) Increase the .so version - enforces recompilation of all binaries.
> >> Intrusive, especially for distributions, but probably safest.
> >>
> >> c) Other ideas?
> >>
> >
> > Heuristically, you can detect most of the shifted flags at runtime
> > because...
> >
> >>
> >>
> >> I don't think there is anything in libfuse that would allow us to
> >> detect which version of libfuse a library was linked to.
> >>
> >
> > I think we do know for sure if fs was linked with libfuse < 3.12
> > without fuse_loop_mt_312?
> > so only left to detect 3.12..3.14 vs. 3.14..3.16
>
> Hmm, I guess I miss something, but how would I know if it was linked
> with fuse_loop_mt_312? That needs an elf reader? Assuming we would put
> this into the library, somehow, how does this work with stripped binaries?
>
> bschubert2@imesrv6 example>nm passthrough_ll | head -n1
> 00000000000003b4 r __abi_tag
> bschubert2@imesrv6 example>strip passthrough_ll
> bschubert2@imesrv6 example>nm passthrough_ll | head -n1
> nm: passthrough_ll: no symbols
>
>
I thought that invoking fuse_loop_mt_*() indicated the lib version
that binary was built with.
I wasn't taking FUSE_USE_VERSION into account.
>
> >
> >>
> >> The commit shifted these members in struct fuse_file_info {
> >>
> >> struct fuse_file_info {
> >> ...
> >> /** Can be filled by open/create, to allow parallel direct writes on this
> >> * file */
> >> unsigned int parallel_direct_writes : 1; --> introduced the shift
> >
> > Not expected in flush/release, so can be heuristically interpreted as flush
> >
> >>
> >> /** Indicates a flush operation. Set in flush operation, also
> >> maybe set in highlevel lock operation and lowlevel release
> >> operation. */
> >> unsigned int flush : 1;
> >>
> >
> > Not expected in open/create, so can be heuristically interpreted as nonseekable
> >
> >> /** Can be filled in by open, to indicate that the file is not
> >> seekable. */
> >> unsigned int nonseekable : 1;
> >>
> >
> > Not expected in release, so can be heuristically interpreted as flock_release
> >
> >> /* Indicates that flock locks for this file should be
> >> released. If set, lock_owner shall contain a valid value.
> >> May only be set in ->release(). */
> >> unsigned int flock_release : 1;
> >>
> >
> > Not expected in opendir, so can be heuristically interpreted as cache_readdir
> >
> >> /** Can be filled in by opendir. It signals the kernel to
> >> enable caching of entries returned by readdir(). Has no
> >> effect when set in other contexts (in particular it does
> >> nothing when set by open()). */
> >> unsigned int cache_readdir : 1;
> >>I am not sure I know how versioned symbols work, but
doesn't new lib always have all the functions
fuse_loop_mt_*() and old binary will be invoking the
No, I was not looking closely.
> >
> > Ignored in open, but based on the comment above, it may be
> > implied that some fs may set it in open() reply
> >
> >> /** Can be filled in by open, to indicate that flush is not needed
> >> on close. */
> >> unsigned int noflush : 1;
> >
> > noflush is just an optimization, which the kernel ignores anyway
> > when writeback cache is enabled, so no harm done if it is wrongly
> > interpreted as readdir_cache in open() and ignored.
> > It is also quite recent (3.11) so not very likely used.
> >
> >> };
> >>
> >> I.e. setting flush would actually set parallel_direct_writes
> >> (note: with binaries compiled against older libfuse versions)
> >>
> >
> > It would be suspicious if fs sets parallel_direct_writes flush at
> > flush() or release(), so that can be used as a hint of old ABI
> > interpreted as flush and issue a warning.
> >
> > It would be pretty ugly to use these heuristics in the library forever,
> > but perhaps as safety measure for binaries built with a range of
> > libfuse version that is not likely to be found in the wild (3.14..3.16)
> > it is an acceptable compromise?
>
> I really like your idea about heuristics, I just don't know how to limit
> the libfuse version range.
>
> >
> > and perhaps the next libfuse version can pass the header version
> > fs was compiled with as argument to fuse_loop_mt_317()?
>
>
> I think adding it to fuse_loop_mt_317 would be easy and could be
> auto-added with a macro. I think more difficult is to add it to the old
> api/abi. I.e. it would take a few years until that api version gets
> mostly used?
>
I guess we would add __fuse_loop_mt_*() that take an auto
lib version argument.
Old binaries are old binaries.
But binaries rebuilt with new lib will always end up calling
__fuse_loop_mt_*() functions with the lib version. No?
Thanks,
Amir.
Reply to: