Bug#177135: File browsers browsing SGI served NFS directories don't display all directories.
Package: general
Version: all kernels from 2.4.18 to 2.4.20, unstable distribution for 
last 6 months.
Severity: important
(Well, important for sites with SGI machines).
This one is difficult to explain and track. It affects many programs, 
such as mozilla, evolution and Java apps, which suggests a shared library.
The actual bug is that file browsers in many applications, most notably 
mozilla, but also some Java applications (like MagicDraw-5.5i), lose 
directories on an NFS mount in an unpredictable way. That is, the 
directories are not displayed in the file browser. Which directories go 
missing is not predictable, but the lost directories stay lost until 
something changes in the directory containing the "missing" entries. 
Then the missing directories may or may not appear, depending on the 
phase of the moon :)
The NFS file service is being served by an SGI running IRIX 6.5. Output 
of uname on the NFS server is
IRIX <hostname> 6.5 10181059 IP32
I have only been able to reproduce this for SGI NFS servers, using both 
Version 2 and 3 protocols.
On the Linux side, my kernel is 2.4.19-rc2, but the problem persists 
across many kernels.
The problem is not present on the current stable branch, even when it 
runs kernels in the same range that show the problem on the unstable 
branch. This makes it unlikely to be a Linux kernel thing. That it 
occurs in a variety of apps suggests a library somewhere that is not 
handling something stupid that the SGI is serving up as a directory 
listing.
The problem occurs on a variety of different linux machines. It is not 
hardware dependent.
To add to the mystery, ls works fine. Solid as a rock, as is file IO etc 
to/from the server. In all other respects except for these strange 
effects in some file browsers, the NFS is solid. (BTW take it for 
granted that all permissions and ownership issues were considered and 
examined till the cows came home). Examining the way ls operates, it 
seems to use open() with the O_DIRECTORY flag rather than opendir() and 
friends. Maybe this is a start?
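
One rough way to test the opendir()/readdir() suspicion outside any file 
browser is sketched below. It assumes CPython on Linux, where os.listdir() 
is, as far as I know, built on opendir()/readdir(). The function names 
list_subdirs and missing_dirs are made up for this illustration, and this 
is only a diagnostic sketch, not what mozilla or the Java apps actually do.

```python
import os
import stat


def list_subdirs(path):
    """Return the set of subdirectory names that os.listdir() reports under path."""
    subdirs = set()
    for name in os.listdir(path):
        try:
            # lstat() each entry so the type comes from the server's
            # attributes, not from anything cached in the listing itself.
            mode = os.lstat(os.path.join(path, name)).st_mode
        except OSError:
            # Entry vanished or cannot be stat'ed; skip it.
            continue
        if stat.S_ISDIR(mode):
            subdirs.add(name)
    return subdirs


def missing_dirs(expected, path):
    """Return the sorted names from expected that were not listed as directories."""
    return sorted(set(expected) - list_subdirs(path))
```

Feeding missing_dirs() the directory names that ls shows on the NFS mount, 
plus the mount path, should give an empty list if this code path sees 
everything. A non-empty result would point at readdir() handling rather 
than at the browsers themselves.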
Konqueror works fine. It does not have any problem scanning the 
directories on the server.
There seemed to be a pattern in that many of the directories that go 
missing have the same name as a username served over YP (or NIS these 
days). But that may well be a false lead, as we have encountered 
exceptions.
All in all, a really strange one.
libc6 is 2.3.1-9, but the problem has persisted through many versions of 
libc6. The problem pre-dates the move to gcc-3.2.2-0 on the current 
unstable branch.
Cheers,
_________________________________________________________
P. Hornby
CSIRO Exploration & Mining ,
AARC, 26 Dick Perry Ave,
Kensington, WA  6151  Australia
Phone +61 8 6436 8500 Fax   +61 8 6436 8555