
NFS re-export crashes client



I figure this has to be as good a forum as any to start with :).
A potato machine, 'ruth', is using the userspace NFS daemon to
re-export some directories served by an SGI O200 to 'r001'. Whenever
a process on r001 tries to read certain 'cluttered' directories
(like most well-used home directories), however, it fills all
memory and dies. Strace on r001 shows an endless series of 'getdents'
calls, while strace on ruth typically shows just two. In one test, it
crashed r001 and wedged the NFS daemon on ruth. I'd have just filed
a bug report, but I can't decide whose bug it might be :).
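
One way to probe a suspect directory from a client without exhausting
memory is a bounded listing that stops as soon as an entry name
repeats. The following is only a minimal sketch in Python, assuming
the runaway 'getdents' sequence keeps returning entries that were
already seen; the names `list_bounded` and `limit` are hypothetical
and not part of any NFS tool.

```python
import os


def list_bounded(path, limit=100000):
    """List a directory, stopping early if the listing looks like it loops.

    Assumption: a looping listing shows up as a repeated entry name.
    `limit` is an arbitrary safety cap for illustration only.
    """
    seen = set()
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.name in seen:
                # The same name came back twice: the listing is repeating.
                raise RuntimeError(
                    "entry %r repeated after %d entries" % (entry.name, len(seen))
                )
            seen.add(entry.name)
            if len(seen) > limit:
                # Give up rather than keep accumulating names in memory.
                raise RuntimeError("more than %d entries, giving up" % limit)
    return sorted(seen)
```

Running it against one of the 'cluttered' directories on r001 and
against the same directory on ruth would show whether the two machines
see the same number of entries before anything goes wrong.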

Some background as to why I want to do this:
I've got a small cluster (r00[1-5]) of diskless systems booting a
2.2.14 kernel from ruth and mounting root from it as well. Since
they are intended only as a compute farm, the only 'outside'
network connection is via ruth, and you must log into
ruth before continuing on to the nodes. This lets the nodes
communicate with each other quickly and easily, with few
security checks on that subnet (set up as 192.168.0.x). Home
directories and data files are stored on the SGI O200. It is
convenient to mount those on the cluster nodes, and the easiest
way I see to do it is to have the master (ruth) re-export the home
directory mountpoints to the cluster using the user-space NFS server
(file access speed isn't particularly important, since the bulk of
the compute jobs are CPU and memory limited).

Thanks for any suggestions,
Andy Roosen

-- 
- Andrew R. Roosen, Ph.D.
- Computer Operations Administrator
- Center for Theoretical and Computational Materials Science
- National Institute of Standards and Technology
-
- Andrew.Roosen@nist.gov
- http://www.ctcms.nist.gov/~roosen/

