[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

[release-critical] NFS locking, policy, fcntl(), and liblockfile.



[Please CC me on any replies, since I'm _not_ subscribed to
debian-devel.  I'm writing this as the Mutt upstream maintainer.]


The Bug Reports
---------------

> Package: mutt (main)
> Maintainer: Marco d'Itri <md@linux.it>
>   39030  bugs.debian.org: heavy problems with potato: loosing mail!
> [HELP] Anyone who knows NFS?

(Before going into the details, let me note that mutt can't do
anything about this.  See below.)

> Package: nfs-common (main)
> Maintainer: Anders Hammarquist <iko@debian.org>
>   43448  nfs-common: --enable-secure-statd breaks fcntl() locking

Additionally, #43491 (against liblockfile1; this has been
re-classified to "wishlist" for reasons incomprehensible to me)
should be taken into consideration.


The Problem
-----------

The problem exhibited in #39030 is that, before synchronizing a mail
folder, mutt has to check whether there were any changes to the
folder which are not known to mutt's internal state.  In order to do
this, the folder is locked (according to the locking method of the
day, usually some combination of fcntl(), flock(), and dotlocking),
stated, read partially.  Now, all these methods to detect new mail
failed.  Even some attempts to detect modifications by doing a
read() beyond what stat() tells us is the end of the file failed.

Some discussions with various people on nfs-devel@kernel.org,
mutt-dev@mutt.org, and a couple of Debian maintainers involved,
revealed the "true cause" for this problem: Lack of NFS client cache
invalidation.  The preferred (and only) method to invalidate NFS
caches is to fcntl()-lock the file in question.  Assuming that this
works, mutt's problematic behaviour should go away.

While cache invalidation actually depends on the presence of NFS
locking support on the server in Linux 2.2 up to and including
2.2.11, there is a patch from Olaf Titz for this, so we can consider
this part of the problem fixed.  (When only potato systems are
involved, fixing #43448 should also make that part of the problem go
away.)

BUT the worse problem is current Debian policy: As far as I can see,
the only prescribed locking method for mail folders is dotlocking.
This does NOT lead to ANY NFS cache invalidation.


The Solution
------------

Thus, a policy change and a corresponding change to liblockfile and
possibly various MUAs and MDAs is required.

Most notably, POSIX file locking MUST be used by any MDA or MUA.
There is no way around this.  Taking possibly broken NFS servers
into account, one may also wish to keep the currently prescribed
dotlocking.  Now, using different locking methods may lead to
deadlocks and other problem when done without thought and without a
well-defined order, as correctly noted by Miquel van Smoorenburg.

Anyway, many MDAs/MUAs do _not_ have a problem with this. More
precisely, the locking is done non-blocking.  That is, there is only
a finite number of attempts to do dotlocking, and fcntl() is used
with F_SETLK, and NOT with F_SETLKW.  When one of the locking
methods fails, _all_ locks acquired so far have to be given up, and
the process should wait for a (possibly random) amount of time.

Sample code:

------------------------------

int do_lock (const char *path, int fd, int retries)
{
	int i;
	int rv;	

	for (i = 0; i < retries; i++)
	{
		if ((rv = do_fcntl_lock (fd)) == -1 && errno != EAGAIN)
			return -1;		
		else if (rv == 0)
		{
			if (do_dotlock (path) == 0)
				return 0;
			do_fcntl_unlock (fd);
		}
		sleep (rand() % 10);
	}
	return -1;
}

------------------------------

This variant would be more or less optimal for Debian's purposes.
Most important, it permits to graciously handle deadlock situations
by giving other programs a chance to do get the complete lock, and
do their work.

Variants of this include multiple, separate retries for each locking
method:

------------------------------

int do_lock (const char *path, int fd, int retries)
{
	int i;
	int rv = -1;

#ifdef USE_FCNTL	
	for (i = 0; i < retries; i++)
	{	
		if ((rv = do_fcntl_lock (fd)) == -1 && errno != EAGAIN)
			return -1;
		else if (rv == 0)
			break;
		
		sleep (rand() % 10);
	}		

	if (rv != 0) return -1;
#endif

#ifdef USE_DOTLOCK
	for (i = 0; i < retries; i++)
	{	
		if ((rv = do_dotlock (path)) == 0)
			return 0;

		sleep (rand() % 10);
	}

#  ifdef USE_FCNTL
	if (rv != 0)
		do_fcntl_unlock (fd);
#  endif
#endif

	return rv;
}

------------------------------

While the latter method is less elegant and will actually _fail_ in
certain deadlock situations, it _may_ be better suited for programs
which can use a configurable set of locking methods.

I leave the do_fcntl_lock() and do_dotlock() functions as an
exercise to the reader.


Note that MUAs and MDAs MUST NOT lose any data, or bounce messages,
when a lock fails with any of the abovementioned strategies.
Instead, they should "defer" the operation in question.  For an MDA,
this should mean to exit with a return code of EX_TEMPFAIL; for a
MUA, this should mean to give an error message to the user, and let
him retry later.



Even though I am not a member of the Debian project, but just an
interested Debian user and upstream maintainer concerned by a
"release-critical bug", I'd like to propose that the Debian policy
should prescribe the use of the abovementioned methods.  Liblockfile
should implement this.  Additional, MUA and MDA package maintainers
will have to check their packages for conformance with this locking
scheme.

Thanks for listening.



Reply to: