
Re: nfs fails on some clients after power failure.



On Wed, 14 Nov 2007 11:38:52 -0800, David Brodbeck wrote:

> On Nov 14, 2007, at 8:49 AM, Hendrik Boom wrote:
>> Would anyone be able to suggest what's failing, and what to do  
>> about it?
>> Or what information I need to gather to diagnose the situation?
> 
> Try transferring a file between april and the problem machines with  
> FTP.  Both directions.  Sometimes network problems will allow small  
> ping packets to pass but will run into trouble with anything larger.   
> This is especially true of duplex mismatches.
> 
> If FTP works OK, make sure you haven't accidentally enabled a  
> firewall on one of the machines.  Several ports need to be open for  
> NFS to work, including portmap.
> 
> Check the logs on april to see if it's logging anything when the  
> other machines try to connect.

Thanks.  When I found your message I immediately got to work.

I look at the system log on april and nothing stands out.  So I
go to shadow and issue another mount /farhome request so as to
identify a time (now) where to look in the syslog, and it
mounts instantly, with no delay, no timeout.

Damn.  It's intermittent.

The system log entry it generates says

Nov 14 15:04:28 april mountd[2662]: authenticated mount request from 172.25.1.13:898 for /farhome (/farhome)

I look back at the system log (now that I know to search for the word
'/farhome') and find a lot more messages like this from earlier in the day,
when I had been trying to mount /farhome on shadow (172.25.1.13).  The
only differences are the time and the port number. I find ports 782, 783,
786, 900, 856, and 871 mentioned in the earlier versions of this message.
I presume this is OK, though I wonder why the variation.  Could all those
old requests be outstanding, therefore blocking the use of those ports?
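
For what it's worth, tallying these entries by client and source port is
easy to script.  What follows is only a sketch: the regular expression is
modelled on the single mountd line quoted above, so other nfs-utils
versions may need a different pattern, and the function name
mount_requests is made up for illustration.

```python
import re
from collections import defaultdict

# Pattern modelled on the one mountd line quoted above; this is an
# assumption about the log format, not a documented syslog layout.
MOUNT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+mountd\[\d+\]:\s+"
    r"authenticated mount request from (?P<client>[^\s:]+):(?P<port>\d+)\s+"
    r"for (?P<path>\S+)"
)


def mount_requests(lines, path="/farhome"):
    """Map each client to a list of (timestamp, source port) for `path`."""
    seen = defaultdict(list)
    for line in lines:
        m = MOUNT_RE.match(line)
        if m and m.group("path") == path:
            seen[m.group("client")].append(
                (m.group("stamp"), int(m.group("port")))
            )
    return dict(seen)


# The one log line quoted in this message, used as sample input.
sample = [
    "Nov 14 15:04:28 april mountd[2662]: authenticated mount request "
    "from 172.25.1.13:898 for /farhome (/farhome)",
]

if __name__ == "__main__":
    for client, hits in mount_requests(sample).items():
        print(client, hits)
```

Fed the real log file instead of the sample list (its location varies by
system), this would show at a glance which clients asked and from which
ports.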

Mount requests from lovesong (which worked) contained its domain name
lovesong.topoi.pooq.com instead of its IP number 172.25.1.4.

I also found several occurrences of the following message, with different
IP numbers (including those of lovesong and shadow):

Nov 14 11:04:00 april exportfs[2642]: /etc/exports [1]: Neither 'subtree_check' or 'no_subtree_check' specified for export "172.25.1.4:/farhome".   Assuming default behaviour ('subtree_check'). NOTE: this default will change with nfs-utils version 1.1.0

I presume this isn't relevant, either, but mention it for completeness. 
They show up at reboot time.
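
Finding which export entries trigger that warning could be sketched like
this.  It assumes the usual exports layout of a path followed by
client(options) entries on one line, ignores backslash line
continuations, and the sample line is invented for illustration rather
than copied from april's /etc/exports.

```python
def missing_subtree_option(exports_text):
    """Return (path, client) pairs that name neither subtree option."""
    missing = []
    for raw in exports_text.splitlines():
        # Drop comments and blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        path, *clients = line.split()
        for entry in clients:
            # Split "client(opt1,opt2)" into the client and its options.
            client, _, rest = entry.partition("(")
            opts = rest.rstrip(")").split(",") if rest else []
            if "subtree_check" not in opts and "no_subtree_check" not in opts:
                missing.append((path, client or "(any)"))
    return missing


# Hypothetical exports content; the option lists are assumptions.
sample_exports = (
    "# sample only\n"
    "/farhome 172.25.1.4(rw,sync) 172.25.1.13(rw,sync,no_subtree_check)\n"
)

if __name__ == "__main__":
    for path, client in missing_subtree_option(sample_exports):
        print(f"{client}:{path} names neither subtree option")
```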


Anyway, it looks as if it's working now, so I'll have to resume this
conversation next time it fails.  Unless there's something diagnostic I
can do while everything seems to be working, of course.

-- hendrik


