Re: Libfuse interoperability/ABI broken.
- To: Amir Goldstein <amir73il@gmail.com>
- Cc: Ashley Pittman <ashley@pittman.co.uk>, trapexit@spawn.link, "Laszlo Boszormenyi (GCS)" <gcs@debian.org>, debian-devel@lists.debian.org, ubuntu-devel-discuss@lists.ubuntu.com, "fuse-devel@lists.sourceforge.net" <fuse-devel@lists.sourceforge.net>, Miklos Szeredi <miklos@szeredi.hu>, Nikolaus Rath <Nikolaus@rath.org>, Giulio Benetti <giulio.benetti@benettiengineering.com>, Dharmendra Singh <dsingh@ddn.com>
- Subject: Re: Libfuse interoperability/ABI broken.
- From: Bernd Schubert <bernd.schubert@fastmail.fm>
- Date: Wed, 13 Mar 2024 00:57:48 +0100
- Message-id: <[🔎] eb5208a4-2b3e-4fa7-bd97-fbe2573d04db@fastmail.fm>
- In-reply-to: <[🔎] CAOQ4uxgFTZfJF11z7MdgFWDn2fvWcAuYgCJKeh+N2C-Wmfpjvg@mail.gmail.com>
- References: <76E4B0CB-E2B3-4A12-8EAB-868C15599F19@pittman.co.uk> <b1710d2c-9b82-4641-8264-37b0b4fb98b2@fastmail.fm> <ACA7CE7F-3A1D-4E23-90DA-4AB3CA67B8A1@pittman.co.uk> <[🔎] 50f50b57-7f02-46d2-a30d-5da1bf948700@fastmail.fm> <[🔎] CAOQ4uxgFTZfJF11z7MdgFWDn2fvWcAuYgCJKeh+N2C-Wmfpjvg@mail.gmail.com>
Hi Amir,
thanks for your help!
On 3/9/24 03:46, Amir Goldstein wrote:
> On Thu, Mar 7, 2024 at 8:47 PM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
>>
>> Hi all,
>>
>> this is certainly not kind of the mail I was hoping for as a new libfuse
>> maintainer.
>>
>> As you can see from the title and from discussion below (sorry this is
>> not typical ML discussion style), we have a bit of of problem with
>> libfuse ABI compatibility.
>>
>> While scanning through git history, Ashley found a commit that was adding
>> members in the middle of struct - doesn't break API, but binary
>> compatibility. That commit already landed a year ago in these releases:
>>
>> git tag --contains a5eb7f2a0117ab43119ef5724cf5f4f2f181804a
>> fuse-3.14.1
>> fuse-3.15.0
>> fuse-3.15.1
>> fuse-3.16.1
>> fuse-3.16.2
>>
>>
>> Obviously this needs improved testing for, but right now we wonder what
>> would be the preferred action to avoid issues.
>>
>> a) We could fix it and move bits to the right place. Fixes everything
>> compiled with old versions, but breaks everything compiled with the
>> versions above
>>
>> b) Increase the .so version - enforces recompilation of all binaries.
>> Intrusive, especially for distributions, but probably safest.
>>
>> c) Other ideas?
>>
>
> Heuristically, you can detect most of the shifted flags at runtime
> because...
>
>>
>>
>> I don't think there is anything in libfuse that would allow us to
>> detect which version of libfuse a library was linked to.
>>
>
> I think we do know for sure if fs was linked with libfuse < 3.12
> without fuse_loop_mt_312?
> so only left to detect 3.12..3.14 vs. 3.14..3.16
Hmm, I guess I miss something, but how would I know if it was linked
with fuse_loop_mt_312? That needs an elf reader? Assuming we would put
this into the library, somehow, how does this work with stripped binaries?
bschubert2@imesrv6 example>nm passthrough_ll | head -n1
00000000000003b4 r __abi_tag
bschubert2@imesrv6 example>strip passthrough_ll
bschubert2@imesrv6 example>nm passthrough_ll | head -n1
nm: passthrough_ll: no symbols
>
>>
>> The commit shifted these members in struct fuse_file_info {
>>
>> struct fuse_file_info {
>> ...
>> /** Can be filled by open/create, to allow parallel direct writes on this
>> * file */
>> unsigned int parallel_direct_writes : 1; --> introduced the shift
>
> Not expected in flush/release, so can be heuristically interpreted as flush
>
>>
>> /** Indicates a flush operation. Set in flush operation, also
>> maybe set in highlevel lock operation and lowlevel release
>> operation. */
>> unsigned int flush : 1;
>>
>
> Not expected in open/create, so can be heuristically interpreted as nonseekable
>
>> /** Can be filled in by open, to indicate that the file is not
>> seekable. */
>> unsigned int nonseekable : 1;
>>
>
> Not expected in release, so can be heuristically interpreted as flock_release
>
>> /* Indicates that flock locks for this file should be
>> released. If set, lock_owner shall contain a valid value.
>> May only be set in ->release(). */
>> unsigned int flock_release : 1;
>>
>
> Not expected in opendir, so can be heuristically interpreted as cache_readdir
>
>> /** Can be filled in by opendir. It signals the kernel to
>> enable caching of entries returned by readdir(). Has no
>> effect when set in other contexts (in particular it does
>> nothing when set by open()). */
>> unsigned int cache_readdir : 1;
>>
>
> Ignored in open, but based on the comment above, it may be
> implied that some fs may set it in open() reply
>
>> /** Can be filled in by open, to indicate that flush is not needed
>> on close. */
>> unsigned int noflush : 1;
>
> noflush is just an optimization, which the kernel ignores anyway
> when writeback cache is enabled, so no harm done if it is wrongly
> interpreted as readdir_cache in open() and ignored.
> It is also quite recent (3.11) so not very likely used.
>
>> };
>>
>> I.e. setting flush would actually set parallel_direct_writes
>> (note: with binaries compiled against older libfuse versions)
>>
>
> It would be suspicious if fs sets parallel_direct_writes flush at
> flush() or release(), so that can be used as a hint of old ABI
> interpreted as flush and issue a warning.
>
> It would be pretty ugly to use these heuristics in the library forever,
> but perhaps as safety measure for binaries built with a range of
> libfuse version that is not likely to be found in the wild (3.14..3.16)
> it is an acceptable compromise?
I really like your idea about heuristics, I just don't know how to limit
the libfuse version range.
>
> and perhaps the next libfuse version can pass the header version
> fs was compiled with as argument to fuse_loop_mt_317()?
I think adding it to fuse_loop_mt_317 would be easy and could be
auto-added with a macro. I think more difficult is to add it to the old
api/abi. I.e. it would take a few years until that api version gets
mostly used?
Reply to: