Re: Squeeze: sometimes, bind times out (backgrounded) at boot time - Solved







If the hardware isn't completely identical then it is reasonable to
have differences in the parallel boot timings.

Theoretically, the machines were identical, but I haven't inspected them to make sure. The fact is that the time needed to bind successfully to the NIS server differed considerably from one machine to the other.
 
Using it with NIS/YP is not so common, so I think it is
not unlikely that there is a bug related to it there.

It turned out there was no real bug (see below).
 
That seems like a completely separate issue.  Probably should separate
the two problems and address each one individually.  Would be happy to
help with the DNS configuration too.  Describe how it is set up and
the list could provide feedback on how to improve it.

Knowing that there are people ready to help out there always makes me feel good about the community. Thank you for your willingness to help. However, DNS is corporate infrastructure business, and it is out of my scope.
 
DNS is a marvelously designed distributed database system... ... But it is only as good as the configured network around it.

Indeed! :-)
 
Try this experiment.  At the last point in the /etc/init.d/nis startup
script add a short sleep.  That will give the daemons time to finish
and get ready to go.  It is possible that they are not quite ready
yet, so immediately after the end of the script the next one to run
hits them too early.

I suggest changing this in file /etc/init.d/nis:

  case "$1" in
    start)
          do_start
          ;;
    stop)

To this as an experiment:

  case "$1" in
    start)
          do_start
          sleep 5   # <-- Add this sleep to give things more time.
          ;;
    stop)


Bingo!!!!  I haven't done exactly that, but your suggestion helped me understand the NIS init script a bit better. So I just increased the maximum count of the existing "wait for bind to succeed" loop from 10 seconds to 20 seconds, and that did the trick. Even the slowest of our machines boots properly now. The change is shown below:

The init script was:

--------------
bind_wait()
{
    [ "`ypwhich 2>/dev/null`" = "" ] && sleep 1

    if [ "`ypwhich 2>/dev/null`" = "" ]
    then
        bound=""
        log_action_begin_msg "binding to YP server"
        for i in 1 2 3 4 5 6 7 8 9 10
        do
            sleep 1
            log_action_cont_msg "."
            if [ "`ypwhich 2>/dev/null`" != "" ]
            then
                echo -n " done] "
                bound="yes"
                break
            fi
        done
        # This should potentially be an error
        if [ "$bound" ] ; then
            log_action_end_msg 0
        else
            log_action_end_msg 1 "backgrounded"
        fi
    fi
}
--------------
... and I changed that to:
--------------
bind_wait()
{
    [ "`ypwhich 2>/dev/null`" = "" ] && sleep 1

    if [ "`ypwhich 2>/dev/null`" = "" ]
    then
        bound=""
        log_action_begin_msg "binding to YP server"
        for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        do
            sleep 1
            log_action_cont_msg "."
            if [ "`ypwhich 2>/dev/null`" != "" ]
            then
                echo -n " done] "
                bound="yes"
                break
            fi
        done
        # This should potentially be an error
        if [ "$bound" ] ; then
            log_action_end_msg 0
        else
            log_action_end_msg 1 "backgrounded"
        fi
    fi
}
--------------
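
Extending the list of loop counters by hand works, but the same wait could also be driven by a single timeout value. What follows is only a minimal sketch of that idea, not the stock script: the names YP_BIND_TIMEOUT and wait_for_bind are hypothetical and do not exist in /etc/init.d/nis; only the ypwhich call is taken from the script above.

```shell
#!/bin/sh
# Sketch only: YP_BIND_TIMEOUT and wait_for_bind are hypothetical names,
# not part of the stock /etc/init.d/nis script.
YP_BIND_TIMEOUT="${YP_BIND_TIMEOUT:-20}"   # seconds to wait for a binding

# Poll ypwhich once per second until it prints a server name or the
# timeout expires. Returns 0 when bound, 1 otherwise.
wait_for_bind()
{
    elapsed=0
    while [ "$elapsed" -lt "$YP_BIND_TIMEOUT" ]
    do
        if [ -n "$(ypwhich 2>/dev/null)" ]
        then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

Inside bind_wait, the for loop could then be replaced by something like wait_for_bind && bound="yes", so a slower machine only needs a larger YP_BIND_TIMEOUT. Like the hand edit, such a change still has to be rechecked after system updates.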

Now, this will suffice for me for now. I'll have to keep an eye on this script every time I perform system updates, but the machines will boot properly until the IT crew manages to figure out what is delaying the NIS binding process. I'm assuming this is not a real Debian Squeeze issue.

Thank you very much for your help,

João
