Bug#39005: Mail loss over NFS with Kernel 2.2.*
On Sat, 05 Jun 1999, I wrote:
> Package: kernel-image-2.2.9
> Version: 2.2.9-1
> I open /var/spool/mail/roland with Mutt and log into the server using
> ssh at the same time. Then I write a mail on the server using mail
> which is locally delivered using sendmail 8.9.3-3 and procmail
> 3.13.1-1 (compiled with dotlocking).
> After this mail is delivered, I synchronize the mail folder using mutt's
> sync-mailbox command (bound to $). After this the test mail is lost.
I did some more testing on this problem now with Kernel 2.2.11. The
problem is still there but now I know what happens :-)
The problem is that 2.2.11 caches not only attributes over NFS but
also the file contents themselves (2.0.* does not seem to cache files).
This means that the client (running mutt) sometimes doesn't notice
that the mailbox, which mutt is writing, was just changed by procmail
running on the server.
This is a _very_ ugly situation, but there is a trick to work around it:
When fcntl() is used for locking a file, Kernel 2.2.11 will remove the
locked file from the file cache. So mutt is able to find out that
procmail just changed the folder. The idea is good, but there are
some problems with it:
1. By default this works only if the server supports locking (as
knfsd is said to). If the server still uses the user space
nfs-server package, a patch is needed on the client. I attached
this patch, which was written by Olaf Kirch, to this message.
2. Every program accessing the mail spool (via NFS) has to use
dotlocking in combination with fcntl(). As far as I know, at least
liblockfile, procmail and mutt use only dotlocking, without
fcntl(). Policy tells us that the behavior of liblockfile is the
one other programs should implement...
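To illustrate point 2, here is a minimal sketch of what "dotlocking in
combination with fcntl()" could look like. The names mbox_lock() and
mbox_unlock() are made up for this example and are not taken from
liblockfile, procmail or mutt. The sketch also assumes that a plain
O_EXCL create is good enough for the dotlock, which is not safe on older
NFS versions; real dotlocking code usually creates a unique temporary
file and link()s it to the lock name instead.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create "<mailbox>.lock" exclusively.
 * Assumption: O_EXCL is used for brevity; over NFS a real
 * implementation would use the temp-file + link() technique. */
static int dotlock_acquire(const char *lockpath)
{
    int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (fd < 0)
        return -1;              /* e.g. EEXIST: lock already held */
    close(fd);
    return 0;
}

/* Set or clear a whole-file POSIX lock via fcntl().
 * type is F_WRLCK to lock or F_UNLCK to release. */
static int fcntl_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "until end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Take the dotlock first, then the fcntl() lock.
 * Returns an open descriptor for the mailbox, or -1 on failure. */
int mbox_lock(const char *mbox)
{
    char lockpath[4096];
    int fd;

    if (snprintf(lockpath, sizeof lockpath, "%s.lock", mbox)
            >= (int)sizeof lockpath)
        return -1;
    if (dotlock_acquire(lockpath) < 0)
        return -1;

    fd = open(mbox, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || fcntl_lock(fd, F_WRLCK) < 0) {
        if (fd >= 0)
            close(fd);
        unlink(lockpath);       /* give the dotlock back on failure */
        return -1;
    }
    return fd;
}

/* Release both locks in reverse order. Returns 0 on success. */
int mbox_unlock(const char *mbox, int fd)
{
    char lockpath[4096];
    int rc = fcntl_lock(fd, F_UNLCK);

    close(fd);
    if (snprintf(lockpath, sizeof lockpath, "%s.lock", mbox)
            >= (int)sizeof lockpath)
        return -1;
    if (unlink(lockpath) < 0)
        rc = -1;
    return rc;
}
```

A mail program would call mbox_lock() before reading or writing the
folder and mbox_unlock() afterwards. Under the behavior described
above, it is the fcntl() call that makes the 2.2.11 NFS client drop its
cached copy of the file, while the dotlock keeps out programs that know
nothing about fcntl().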
I sent a copy of this message to debian-devel because I think this
means that all programs accessing mail should be changed to use
fcntl() in addition to dotlocking! Maybe I should file bug reports
against all mail locking applications?
In addition, I think the Debian kernel-images should be patched with
the attached patch as well as with the 4 patches from the knfs package
(don't ask me why they still aren't in the upstream kernel as of
2.2.11).
Ciao
Roland
--
* roland@spinnaker.de * http://www.spinnaker.de/ *
PGP: 1024/DD08DD6D 2D E7 CC DE D5 8D 78 BE 3C A0 A4 F1 4B 09 CE AF
--- fs/nfs/file.c.org Tue Jun 1 13:09:01 1999
+++ fs/nfs/file.c Thu Aug 19 22:35:33 1999
@@ -214,7 +214,7 @@
/* Fake OK code if mounted without NLM support */
if (NFS_SERVER(inode)->flags & NFS_MOUNT_NONLM)
- return 0;
+ /* return 0; */ goto out_okay;
/*
* No BSD flocks over NFS allowed.
@@ -241,6 +241,7 @@
* Make sure we re-validate anything we've got cached.
* This makes locking act as a cache coherency point.
*/
+out_okay:
NFS_CACHEINV(inode);
return 0;
}