Re: Transitioning from existence-based lockfiles and /var/lock to flock
On Mon, 13 Oct 2025 at 11:04:05 -0700, Josh Triplett wrote:
> - Software should not use existence-based lockfiles (where the existence
>   of the lockfile constitutes holding the lock); software should use
>   file-based locking (`flock`) on an appropriate file instead.
There are several orthogonal advisory lock mechanisms, and I don't think
Policy should take a general position on which one should be used, as
long as all programs that might want to exclude each other by holding a
lock can agree on which one they are going to use. The ones I know about
are:
* flock(2) (originally from BSD, not specified by POSIX) and its
  command-line interface flock(1) (util-linux)
* fcntl F_SETLK and friends (POSIX)
* fcntl F_OFD_SETLK and friends (Linux-specific)
* lockf(3) (POSIX)
  - which wraps one of the fcntl locks on GNU/Linux, but might be
    something else on other kernel/libc combinations
There might be others.
In general it would be a significant bug to replace one of these with
another of these without domain-specific attention being paid to the
subtleties of their semantics in terms of which ones exclude each other,
which ones can be inherited from parent to child, which ones are scoped
to a process or a thread or an open file description, and so on.
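To illustrate one of those scoping subtleties: a traditional fcntl lock belongs to the process, and is dropped as soon as that process closes any file descriptor referring to the same file. The following is only a minimal sketch, assuming a POSIX-like system and Python's fcntl.lockf() (which wraps the fcntl(2) locking calls); the helper names child_can_lock and demo_close_drops_lock are made up for this example.

```python
import fcntl
import os
import tempfile


def child_can_lock(path):
    """Fork a child that tries a non-blocking exclusive fcntl lock on path.

    Returns True if the child acquired the lock, i.e. nobody else held it.
    """
    pid = os.fork()
    if pid == 0:
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(0)
        except OSError:
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def demo_close_drops_lock():
    """Return (held_before, held_after) around an unrelated open/close."""
    with tempfile.NamedTemporaryFile() as f:
        # Process-scoped fcntl lock on the whole file.
        fcntl.lockf(f.fileno(), fcntl.LOCK_EX)
        held_before = not child_can_lock(f.name)

        # Opening and closing a second descriptor for the same file
        # releases every fcntl lock this process holds on that file.
        os.close(os.open(f.name, os.O_RDONLY))
        held_after = not child_can_lock(f.name)
        return held_before, held_after


if __name__ == "__main__":
    print(demo_close_drops_lock())
```

A flock(2) or OFD lock would not be lost in this way, because those are tied to the open file description rather than to the process, which is exactly why swapping one mechanism for another needs care.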
For example, when Flatpak wants to prevent a concurrent Flatpak process
from deleting a runtime that is in use by an app, it implements that by
locking the file ${runtime}/.ref with fcntl F_SETLK. It would be fine
for a program that interacts with Flatpak (or a newer version of Flatpak
itself) to use either F_SETLK or F_OFD_SETLK on ${runtime}/.ref,
because F_OFD_SETLK is documented to be mutually exclusive with an
incompatible F_SETLK, but it would be a potentially serious bug for it
to use flock(2), because it is unspecified whether flock(2) and F_SETLK
exclude each other (and on Linux they don't, unless NFS happens to be
involved).
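The non-interaction between flock(2) and fcntl locks can be checked with something like the following. This is a rough sketch and not Flatpak's code: it uses a temporary file rather than a real ${runtime}/.ref, the function names are invented for the example, and it assumes Linux with a local (non-NFS) filesystem for the cross-mechanism result.

```python
import fcntl
import os
import tempfile


def try_lock(path, mechanism):
    """Try a non-blocking exclusive lock on path; return True if acquired."""
    fd = os.open(path, os.O_RDWR)
    try:
        if mechanism == "flock":
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        else:
            # Python's fcntl.lockf() wraps the fcntl(2) locking calls.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def try_lock_in_child(path, mechanism):
    """Run try_lock() in a forked child so that it is a separate process."""
    pid = os.fork()
    if pid == 0:
        os._exit(0 if try_lock(path, mechanism) else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def cross_mechanism_demo():
    """Hold a flock(2) lock and see which mechanisms it excludes."""
    with tempfile.NamedTemporaryFile() as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        return {
            # Same mechanism: the child's flock attempt should fail.
            "flock_blocks_flock": not try_lock_in_child(f.name, "flock"),
            # Different mechanism: unspecified in general.
            "flock_blocks_fcntl": not try_lock_in_child(f.name, "fcntl"),
        }


if __name__ == "__main__":
    print(cross_mechanism_demo())
```

On Linux with a local filesystem I would expect the second entry to be False, meaning the child happily takes an fcntl lock while the parent believes it holds the file exclusively. Other kernels are free to behave differently.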
Similarly, it would be a potentially serious bug if one program locked
the file ${runtime}/.ref, but another took out a lock on the directory
itself, ${runtime}, intending to exclude the other program. Either one
of those two locking disciplines is OK in isolation, but the two
programs must agree on which one they are going to use. Clusters of
closely-cooperating programs can just agree this among themselves
without any special coordination and without any Policy involvement, but
broader or looser categories of programs could benefit from coordination
in Policy.
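The directory-versus-file mismatch is easy to reproduce. As a minimal sketch, assuming flock(2) is the agreed mechanism and using a temporary directory as a stand-in for ${runtime} (the function name is hypothetical):

```python
import fcntl
import os
import tempfile


def both_locks_acquired():
    """Lock a directory and a file inside it, both exclusively.

    Returns True if both non-blocking locks succeed, which shows that
    the two locks do not exclude each other.
    """
    with tempfile.TemporaryDirectory() as runtime:
        ref = os.path.join(runtime, ".ref")
        open(ref, "w").close()

        dir_fd = os.open(runtime, os.O_RDONLY)
        ref_fd = os.open(ref, os.O_RDONLY)
        try:
            # "Program A" locks the directory itself.
            fcntl.flock(dir_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            # "Program B" locks the .ref file inside it.
            fcntl.flock(ref_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError:
            return False
        finally:
            os.close(ref_fd)
            os.close(dir_fd)


if __name__ == "__main__":
    print(both_locks_acquired())
```

Both locks are granted because they are on different inodes, so each program would think it had exclusive access.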
In particular:
Policy does specify (in §11.6) how to lock the mailboxes in /var/mail/,
because that is an example of a single domain-specific context where
it's necessary that everything agrees. (It already calls for this to be
done inside /var/mail/ rather than involving /var/lock/ or /run/lock/,
so it's out-of-scope for #1115317.)
According to #1110980 and #1110981, the FHS, which Policy incorporates
by reference, specifies the use of lock files in /var/lock/ for serial
ports. If we want programs like uucp to prefer to use flock or fcntl
locks for this purpose, then we will need to document an FHS exception in
Policy for this, and specify which of the various advisory locking
mechanisms is to be used for it - preferably one that is already
supported in software that locks serial ports, or already used in other
distros. In #1110980, Luca recommended "BSD locks" and mentioned that
some serial-port-related software already supports those, but I'm not
sure which specific API that was intended to refer to - as approximately
POSIX-compliant OS distributions, the BSDs presumably support both
flock(2) and fcntl F_SETLK, and possibly others.
I think it would be best to have a specific, narrowly-scoped bug to
agree on how programs like uucp should lock serial ports, with its
conclusion documented in Policy. I don't know whether there are other
non-closely-cooperating groups of programs currently using /run/lock/ or
(equivalently) /var/lock/ that need similar coordination.
smcv