Transitioning from existence-based lockfiles and /var/lock to flock
Based on #1115317, the Technical Committee called for a transition on
locking, and an update to Policy. In general, Policy updates tend to be
lagging indicators, updating after most packages are already compliant.
In this case, most packages already *do* use flock-based locking, rather
than existence-based lockfiles, so I think it'd be reasonable to capture
this in Policy (as a "should" initially, making it a *normal* bug to do
otherwise) *concurrently* with porting the remaining software over to
flock. It could then be raised to a "must" when almost all software has
moved over.
I'd be happy to put together an initial Policy draft, but I'd like to
get rough consensus on a sketch first. Most of this would go in a new
section on lockfiles, except for the last point.
- Software should not use existence-based lockfiles (where the existence
of the lockfile constitutes holding the lock); software should use
file-based locking (`flock`) on an appropriate file instead.
- Where possible, software should apply `flock` to an appropriate target
file rather than a dedicated lockfile. For instance, if locking a
device or a data file, software should `flock` the device file or data
file, rather than creating a separate file to lock.
- `/var/lock` and `/run/lock` should only be used by software designed
to run as root, and should only be used if there is not an appropriate
target file to lock instead. Even when using `/var/lock` or
`/run/lock`, the lockfile should be handled using `flock`, not an
existence-based lock.
- As a transitional measure, when lockfiles are used for coordination
across multiple pieces of software, and that software has historically
used existence-based lockfiles or (as non-root) lockfiles in
`/var/lock` or `/run/lock`, software should *both* `flock` an
appropriate target file or non-existence-based lockfile *and* attempt
to acquire the traditional lock. To avoid deadlocks and to allow
detection of un-transitioned software, the locks should be acquired in
that order and dropped in reverse order. For the traditional lock, the
software should distinguish between:
  - Failing to acquire the lock because another process has it. The
    software should treat the resource as locked and act accordingly:
    drop the flock, and warn that either the lock is stale or the
    process owning the lock has not yet been transitioned to `flock`.
  - Failing to acquire the lock for some other reason, such as a
    permission error (e.g. inability to write to `/var/lock`). The
    software should silently ignore this and assume the system has been
    transitioned to use `flock` exclusively.
- After an appropriately coordinated transitional period for that
cooperating group of software, the software may drop the acquisition
of the traditional lock, and use `flock` exclusively.
- (In "9.1.1 File System Structure") 14. See "Lockfiles"
(cross-reference to the above section) for details on handling
lockfiles, rather than defaulting to the use of `/var/lock` or
`/run/lock`.
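
To make the transitional ordering concrete, here is a rough Python
sketch of what a cooperating program might do. It is only an
illustration under stated assumptions, not proposed Policy text: the
names `transitional_lock` and `LockBusy` are hypothetical, the
traditional lock is assumed to be a plain existence-based file created
with `O_EXCL`, and writing the PID into it follows the usual convention
rather than anything required above.

```python
import fcntl
import os
import sys
from contextlib import contextmanager


class LockBusy(Exception):
    """Hypothetical error: another process appears to hold a lock."""


@contextmanager
def transitional_lock(target_path, legacy_path):
    """Hold flock on target_path, plus the traditional lockfile if possible."""
    # Step 1: flock the target file itself (e.g. the device or data file).
    target_fd = os.open(target_path, os.O_RDONLY)
    try:
        try:
            fcntl.flock(target_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise LockBusy(f"{target_path} is flock'ed by another process")

        # Step 2: attempt the traditional existence-based lock.
        have_legacy = False
        try:
            legacy_fd = os.open(
                legacy_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644
            )
        except FileExistsError:
            # Someone else holds it: treat as locked and warn. The flock
            # is dropped by the outer "finally" when we raise.
            print(
                f"warning: {legacy_path} exists; it is stale or its owner "
                "has not been transitioned to flock",
                file=sys.stderr,
            )
            raise LockBusy(f"{legacy_path} is held by another process")
        except OSError:
            # Any other failure (e.g. no permission to write to the lock
            # directory): silently assume a flock-only system.
            pass
        else:
            # Assumption: record our PID, as traditional lockfiles do.
            with os.fdopen(legacy_fd, "w") as f:
                f.write(f"{os.getpid():>10}\n")
            have_legacy = True

        try:
            yield
        finally:
            # Release in reverse order: traditional lock first.
            if have_legacy:
                os.unlink(legacy_path)
    finally:
        # Closing the descriptor releases the flock.
        os.close(target_fd)
```

A caller would wrap its critical section in
`with transitional_lock(device_path, legacy_lock_path): ...` and treat
`LockBusy` as "resource in use". Once the cooperating group of software
has finished the transition, the second step can simply be deleted,
leaving only the `flock` on the target file.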