Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
- To: Stephen Dowdy <sdowdy@ucar.edu>, 883217@bugs.debian.org
- Cc: 855632@bugs.debian.org
- Subject: Bug#855632: Bug#883217: linux: open on NFSv4 exported file on nfs server: "Resource temporarily unavailable" under reproducible conditions when client has granted read delegation on file
- From: Salvatore Bonaccorso <carnil@debian.org>
- Date: Mon, 4 Dec 2017 21:24:55 +0100
- Message-id: <[🔎] 20171204202455.GA15666@eldamar.local>
- Reply-to: Salvatore Bonaccorso <carnil@debian.org>, 855632@bugs.debian.org
- In-reply-to: <9c8a3698-9963-d207-97ba-f9978ab206c1@ucar.edu>
- References: <151207435082.19772.685743359062788785.reportbug@eldamar.local> <151207435082.19772.685743359062788785.reportbug@eldamar.local> <9c8a3698-9963-d207-97ba-f9978ab206c1@ucar.edu> <20170220210934.35230.5117.reportbug@gizmo.rap.ucar.edu>
Hi
On Thu, Nov 30, 2017 at 03:35:40PM -0700, Stephen Dowdy wrote:
> On 11/30/2017 01:39 PM, Salvatore Bonaccorso wrote:
> > Is this worth trying to be fixed for the jessie kernel?
>
> Salvatore,
>
> I believe this is likely the reason for my bug report:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855632
>
> as that system has thrown EAGAIN errors since i installed it in April, 2015.
> It's a 10 NIC NFS server for the department, and often throws the error when i update files that are likely being read/open by client systems.
> (it doesn't have a huge resource consumption load ever and i get that failure)
>
> So, i vote yeah ;)
Okay.
I tried to track that further down, and attached
0001-locks-remove-i_have_this_lease-check-from-__break_le.patch
0002-locks-__break_lease-cleanup-in-preparation-of-allowi.patch
to be applied on top of the current jessie branch in git.
Attached are the two individual patches:
locks-remove-i_have_this_lease-check-from-__break_le.patch
locks-__break_lease-cleanup-in-preparation-of-allowi.patch
With these two patches applied I was not able to reproduce the problem
now for a while, whereas previously it was relatively fast
triggerable.
Can you confirm the issue would be addressed as well for you?
See the kernel-handbook for the simple-patching guideline:
https://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s4.2.2
Still since the patches were integrated in a bigger rewrite/touching
of fs/locks.c, fs/nfsd this might need a proper/deeper review if that
is complete and does not break anything.
Regards,
Salvatore
>From 6997c3a97579e46cb839c334b4b9b6f96c3b573b Mon Sep 17 00:00:00 2001
From: Salvatore Bonaccorso <carnil@debian.org>
Date: Mon, 4 Dec 2017 11:11:28 +0100
Subject: [PATCH 1/2] locks: remove i_have_this_lease check from __break_lease
---
debian/changelog | 6 +++
...e-i_have_this_lease-check-from-__break_le.patch | 54 ++++++++++++++++++++++
debian/patches/series | 1 +
3 files changed, 61 insertions(+)
create mode 100644 debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
diff --git a/debian/changelog b/debian/changelog
index 977e1cea3..955b86f56 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+linux (3.16.51-3) UNRELEASED; urgency=medium
+
+ * locks: remove i_have_this_lease check from __break_lease
+
+ -- Salvatore Bonaccorso <carnil@debian.org> Mon, 04 Dec 2017 12:17:53 +0100
+
linux (3.16.51-2) jessie; urgency=medium
* [mips*] inst: Avoid ABI change in 3.16.51
diff --git a/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch b/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
new file mode 100644
index 000000000..04a778b40
--- /dev/null
+++ b/debian/patches/bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
@@ -0,0 +1,54 @@
+From: Jeff Layton <jlayton@primarydata.com>
+Date: Mon, 1 Sep 2014 14:27:43 -0400
+Subject: locks: remove i_have_this_lease check from __break_lease
+Origin: https://git.kernel.org/linus/843c6b2f4cef384af8e0de6b7ac7191675030e3a
+
+I think that the intent of this code was to ensure that a process won't
+deadlock if it has one fd open with a lease on it and then breaks that
+lease by opening another fd. In that case it'll treat the __break_lease
+call as if it were non-blocking.
+
+This seems wrong -- the process could (for instance) be multithreaded
+and managing different fds via different threads. I also don't see any
+mention of this limitation in the (somewhat sketchy) documentation.
+
+Remove the check and the non-blocking behavior when i_have_this_lease
+is true.
+
+Signed-off-by: Jeff Layton <jlayton@primarydata.com>
+[carnil: Backport for 3.16:
+ - adjust context
+]
+---
+ fs/locks.c | 6 ++----
+ 1 file changed, 2 insertions(+), 4 deletions(-)
+
+--- a/fs/locks.c
++++ b/fs/locks.c
+@@ -1326,7 +1326,6 @@ int __break_lease(struct inode *inode, u
+ struct file_lock *new_fl, *flock;
+ struct file_lock *fl;
+ unsigned long break_time;
+- int i_have_this_lease = 0;
+ bool lease_conflict = false;
+ int want_write = (mode & O_ACCMODE) != O_RDONLY;
+
+@@ -1346,8 +1345,7 @@ int __break_lease(struct inode *inode, u
+ for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ if (leases_conflict(fl, new_fl)) {
+ lease_conflict = true;
+- if (fl->fl_owner == current->files)
+- i_have_this_lease = 1;
++ break;
+ }
+ }
+ if (!lease_conflict)
+@@ -1377,7 +1375,7 @@ int __break_lease(struct inode *inode, u
+ fl->fl_lmops->lm_break(fl);
+ }
+
+- if (i_have_this_lease || (mode & O_NONBLOCK)) {
++ if (mode & O_NONBLOCK) {
+ trace_break_lease_noblock(inode, new_fl);
+ error = -EWOULDBLOCK;
+ goto out;
diff --git a/debian/patches/series b/debian/patches/series
index 4cd4a739c..4ab96adb2 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -251,6 +251,7 @@ bugfix/all/vfs-avoid-creation-of-inode-number-0-in-get_next_ino.patch
bugfix/all/mm-mmap.c-expand_downwards-don-t-require-the-gap-if-.patch
bugfix/x86/mmap-remember-the-map_fixed-flag-as-vm_fixed.patch
bugfix/x86/mmap-add-an-exception-to-the-stack-gap-for-hotspot-jvm.patch
+bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
# memfd_create() & kdbus backport
features/all/kdbus/mm-allow-drivers-to-prevent-new-writable-mappings.patch
--
2.15.1
>From 92b1365d214c86a4c5eeb25c131dfe0255585ca0 Mon Sep 17 00:00:00 2001
From: Salvatore Bonaccorso <carnil@debian.org>
Date: Mon, 4 Dec 2017 11:16:30 +0100
Subject: [PATCH 2/2] locks: __break_lease cleanup in preparation of allowing
direct removal of leases
Closes: #883217
---
debian/changelog | 2 +
...ak_lease-cleanup-in-preparation-of-allowi.patch | 130 +++++++++++++++++++++
debian/patches/series | 1 +
3 files changed, 133 insertions(+)
create mode 100644 debian/patches/bugfix/all/locks-__break_lease-cleanup-in-preparation-of-allowi.patch
diff --git a/debian/changelog b/debian/changelog
index 955b86f56..7ef7d931b 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,6 +1,8 @@
linux (3.16.51-3) UNRELEASED; urgency=medium
* locks: remove i_have_this_lease check from __break_lease
+ * locks: __break_lease cleanup in preparation of allowing direct removal of
+ leases (Closes: #883217)
-- Salvatore Bonaccorso <carnil@debian.org> Mon, 04 Dec 2017 12:17:53 +0100
diff --git a/debian/patches/bugfix/all/locks-__break_lease-cleanup-in-preparation-of-allowi.patch b/debian/patches/bugfix/all/locks-__break_lease-cleanup-in-preparation-of-allowi.patch
new file mode 100644
index 000000000..b52e876fa
--- /dev/null
+++ b/debian/patches/bugfix/all/locks-__break_lease-cleanup-in-preparation-of-allowi.patch
@@ -0,0 +1,130 @@
+From: Jeff Layton <jlayton@primarydata.com>
+Date: Mon, 1 Sep 2014 14:53:41 -0400
+Subject: locks: __break_lease cleanup in preparation of allowing direct
+ removal of leases
+Origin: https://git.kernel.org/linus/03d12ddf845a4eb874ffa558d65a548aee9b715b
+Bug-Debian: https://bugs.debian.org/883217
+
+Eliminate an unneeded "flock" variable. We can use "fl" as a loop cursor
+everywhere. Add a any_leases_conflict helper function as well to
+consolidate a bit of code.
+
+Signed-off-by: Jeff Layton <jlayton@primarydata.com>
+Reviewed-by: Christoph Hellwig <hch@lst.de>
+[carnil: backport for 3.16:
+ - adjust context
+]
+---
+ fs/locks.c | 49 +++++++++++++++++++++++++------------------------
+ 1 file changed, 25 insertions(+), 24 deletions(-)
+
+--- a/fs/locks.c
++++ b/fs/locks.c
+@@ -1307,6 +1307,20 @@ static bool leases_conflict(struct file_
+ return locks_conflict(breaker, lease);
+ }
+
++static bool
++any_leases_conflict(struct inode *inode, struct file_lock *breaker)
++{
++ struct file_lock *fl;
++
++ lockdep_assert_held(&inode->i_lock);
++
++ for (fl = inode->i_flock ; fl && IS_LEASE(fl); fl = fl->fl_next) {
++ if (leases_conflict(fl, breaker))
++ return true;
++ }
++ return false;
++}
++
+ /**
+ * __break_lease - revoke all outstanding leases on file
+ * @inode: the inode of the file to return
+@@ -1323,10 +1337,9 @@ static bool leases_conflict(struct file_
+ int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
+ {
+ int error = 0;
+- struct file_lock *new_fl, *flock;
++ struct file_lock *new_fl;
+ struct file_lock *fl;
+ unsigned long break_time;
+- bool lease_conflict = false;
+ int want_write = (mode & O_ACCMODE) != O_RDONLY;
+
+ new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
+@@ -1338,17 +1351,7 @@ int __break_lease(struct inode *inode, u
+
+ time_out_leases(inode);
+
+- flock = inode->i_flock;
+- if ((flock == NULL) || !IS_LEASE(flock))
+- goto out;
+-
+- for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+- if (leases_conflict(fl, new_fl)) {
+- lease_conflict = true;
+- break;
+- }
+- }
+- if (!lease_conflict)
++ if (!any_leases_conflict(inode, new_fl))
+ goto out;
+
+ break_time = 0;
+@@ -1358,7 +1361,7 @@ int __break_lease(struct inode *inode, u
+ break_time++; /* so that 0 means no break time */
+ }
+
+- for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
++ for (fl = inode->i_flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ if (!leases_conflict(fl, new_fl))
+ continue;
+ if (want_write) {
+@@ -1367,7 +1370,7 @@ int __break_lease(struct inode *inode, u
+ fl->fl_flags |= FL_UNLOCK_PENDING;
+ fl->fl_break_time = break_time;
+ } else {
+- if (lease_breaking(flock))
++ if (lease_breaking(inode->i_flock))
+ continue;
+ fl->fl_flags |= FL_DOWNGRADE_PENDING;
+ fl->fl_downgrade_time = break_time;
+@@ -1382,12 +1385,12 @@ int __break_lease(struct inode *inode, u
+ }
+
+ restart:
+- break_time = flock->fl_break_time;
++ break_time = inode->i_flock->fl_break_time;
+ if (break_time != 0)
+ break_time -= jiffies;
+ if (break_time == 0)
+ break_time++;
+- locks_insert_block(flock, new_fl);
++ locks_insert_block(inode->i_flock, new_fl);
+ trace_break_lease_block(inode, new_fl);
+ spin_unlock(&inode->i_lock);
+ error = wait_event_interruptible_timeout(new_fl->fl_wait,
+@@ -1396,17 +1399,15 @@ restart:
+ trace_break_lease_unblock(inode, new_fl);
+ locks_delete_block(new_fl);
+ if (error >= 0) {
+- if (error == 0)
+- time_out_leases(inode);
+ /*
+ * Wait for the next conflicting lease that has not been
+ * broken yet
+ */
+- for (flock = inode->i_flock; flock && IS_LEASE(flock);
+- flock = flock->fl_next) {
+- if (leases_conflict(new_fl, flock))
+- goto restart;
+- }
++ if (error == 0)
++ time_out_leases(inode);
++ if (any_leases_conflict(inode, new_fl))
++ goto restart;
++
+ error = 0;
+ }
+
diff --git a/debian/patches/series b/debian/patches/series
index 4ab96adb2..50a0f5f7a 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -252,6 +252,7 @@ bugfix/all/mm-mmap.c-expand_downwards-don-t-require-the-gap-if-.patch
bugfix/x86/mmap-remember-the-map_fixed-flag-as-vm_fixed.patch
bugfix/x86/mmap-add-an-exception-to-the-stack-gap-for-hotspot-jvm.patch
bugfix/all/locks-remove-i_have_this_lease-check-from-__break_le.patch
+bugfix/all/locks-__break_lease-cleanup-in-preparation-of-allowi.patch
# memfd_create() & kdbus backport
features/all/kdbus/mm-allow-drivers-to-prevent-new-writable-mappings.patch
--
2.15.1
From: Jeff Layton <jlayton@primarydata.com>
Date: Mon, 1 Sep 2014 14:53:41 -0400
Subject: locks: __break_lease cleanup in preparation of allowing direct
removal of leases
Origin: https://git.kernel.org/linus/03d12ddf845a4eb874ffa558d65a548aee9b715b
Bug-Debian: https://bugs.debian.org/883217
Eliminate an unneeded "flock" variable. We can use "fl" as a loop cursor
everywhere. Add a any_leases_conflict helper function as well to
consolidate a bit of code.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[carnil: backport for 3.16:
- adjust context
]
---
fs/locks.c | 49 +++++++++++++++++++++++++------------------------
1 file changed, 25 insertions(+), 24 deletions(-)
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1307,6 +1307,20 @@ static bool leases_conflict(struct file_
return locks_conflict(breaker, lease);
}
+static bool
+any_leases_conflict(struct inode *inode, struct file_lock *breaker)
+{
+ struct file_lock *fl;
+
+ lockdep_assert_held(&inode->i_lock);
+
+ for (fl = inode->i_flock ; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ if (leases_conflict(fl, breaker))
+ return true;
+ }
+ return false;
+}
+
/**
* __break_lease - revoke all outstanding leases on file
* @inode: the inode of the file to return
@@ -1323,10 +1337,9 @@ static bool leases_conflict(struct file_
int __break_lease(struct inode *inode, unsigned int mode, unsigned int type)
{
int error = 0;
- struct file_lock *new_fl, *flock;
+ struct file_lock *new_fl;
struct file_lock *fl;
unsigned long break_time;
- bool lease_conflict = false;
int want_write = (mode & O_ACCMODE) != O_RDONLY;
new_fl = lease_alloc(NULL, want_write ? F_WRLCK : F_RDLCK);
@@ -1338,17 +1351,7 @@ int __break_lease(struct inode *inode, u
time_out_leases(inode);
- flock = inode->i_flock;
- if ((flock == NULL) || !IS_LEASE(flock))
- goto out;
-
- for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
- if (leases_conflict(fl, new_fl)) {
- lease_conflict = true;
- break;
- }
- }
- if (!lease_conflict)
+ if (!any_leases_conflict(inode, new_fl))
goto out;
break_time = 0;
@@ -1358,7 +1361,7 @@ int __break_lease(struct inode *inode, u
break_time++; /* so that 0 means no break time */
}
- for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
+ for (fl = inode->i_flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
if (!leases_conflict(fl, new_fl))
continue;
if (want_write) {
@@ -1367,7 +1370,7 @@ int __break_lease(struct inode *inode, u
fl->fl_flags |= FL_UNLOCK_PENDING;
fl->fl_break_time = break_time;
} else {
- if (lease_breaking(flock))
+ if (lease_breaking(inode->i_flock))
continue;
fl->fl_flags |= FL_DOWNGRADE_PENDING;
fl->fl_downgrade_time = break_time;
@@ -1382,12 +1385,12 @@ int __break_lease(struct inode *inode, u
}
restart:
- break_time = flock->fl_break_time;
+ break_time = inode->i_flock->fl_break_time;
if (break_time != 0)
break_time -= jiffies;
if (break_time == 0)
break_time++;
- locks_insert_block(flock, new_fl);
+ locks_insert_block(inode->i_flock, new_fl);
trace_break_lease_block(inode, new_fl);
spin_unlock(&inode->i_lock);
error = wait_event_interruptible_timeout(new_fl->fl_wait,
@@ -1396,17 +1399,15 @@ restart:
trace_break_lease_unblock(inode, new_fl);
locks_delete_block(new_fl);
if (error >= 0) {
- if (error == 0)
- time_out_leases(inode);
/*
* Wait for the next conflicting lease that has not been
* broken yet
*/
- for (flock = inode->i_flock; flock && IS_LEASE(flock);
- flock = flock->fl_next) {
- if (leases_conflict(new_fl, flock))
- goto restart;
- }
+ if (error == 0)
+ time_out_leases(inode);
+ if (any_leases_conflict(inode, new_fl))
+ goto restart;
+
error = 0;
}
From: Jeff Layton <jlayton@primarydata.com>
Date: Mon, 1 Sep 2014 14:27:43 -0400
Subject: locks: remove i_have_this_lease check from __break_lease
Origin: https://git.kernel.org/linus/843c6b2f4cef384af8e0de6b7ac7191675030e3a
I think that the intent of this code was to ensure that a process won't
deadlock if it has one fd open with a lease on it and then breaks that
lease by opening another fd. In that case it'll treat the __break_lease
call as if it were non-blocking.
This seems wrong -- the process could (for instance) be multithreaded
and managing different fds via different threads. I also don't see any
mention of this limitation in the (somewhat sketchy) documentation.
Remove the check and the non-blocking behavior when i_have_this_lease
is true.
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
[carnil: Backport for 3.16:
- adjust context
]
---
fs/locks.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1326,7 +1326,6 @@ int __break_lease(struct inode *inode, u
struct file_lock *new_fl, *flock;
struct file_lock *fl;
unsigned long break_time;
- int i_have_this_lease = 0;
bool lease_conflict = false;
int want_write = (mode & O_ACCMODE) != O_RDONLY;
@@ -1346,8 +1345,7 @@ int __break_lease(struct inode *inode, u
for (fl = flock; fl && IS_LEASE(fl); fl = fl->fl_next) {
if (leases_conflict(fl, new_fl)) {
lease_conflict = true;
- if (fl->fl_owner == current->files)
- i_have_this_lease = 1;
+ break;
}
}
if (!lease_conflict)
@@ -1377,7 +1375,7 @@ int __break_lease(struct inode *inode, u
fl->fl_lmops->lm_break(fl);
}
- if (i_have_this_lease || (mode & O_NONBLOCK)) {
+ if (mode & O_NONBLOCK) {
trace_break_lease_noblock(inode, new_fl);
error = -EWOULDBLOCK;
goto out;
Reply to: