
readonly NFS root: udev means can't use stock kernel? (long)



I'm trying to set up an NFS read-only root filesystem; what's more, I'm
trying to do it using absolutely standard kernels! This is important
because it is possible that I will have to repeat much of this on
RHEL for application support reasons, which in turn will also mean
keeping stock kernels. But for all the obvious reasons I'm trying it
on Debian first.

The PXE/DHCP/TFTP parts are all fine. I cloned an existing machine into
a chrooted environment and stripped it down, and now I can boot the OS
and everything looks pretty good, but there are a number of irritations.
Any advice anyone can give is much appreciated!

('ga010133vm3' is my test NFS root client. '134.171.27.236' is my NFS
root server.)

root mounted three times!
-------------------------

ga010133vm3# df
Filesystem                    1K-blocks      Used Available Use% Mounted on
rootfs                         14721376   8549152   5424384  62% /
udev                              10240        24     10216   1% /dev
134.171.27.236:/diska/nfsroot  14721376   8549152   5424384  62% /
134.171.27.236:/diska/nfsroot  14721376   8549152   5424384  62% /dev/.static/dev
tmpfs                            128488        80    128408   1% /tmp
ga010133vm3#

The first entry ('rootfs ...') I guess is the one added by the kernel
itself because of the parameters it sees on the command line, in turn
provided by the pxe.config file, but why isn't that root fs pivoted out 
(or whatever it is that happens these days) as it mounts the desired
root fs 134.171.27.236:/diska/nfsroot on / ?

I added 'noauto' to the entry for / in /etc/fstab but that makes
no difference at all, presumably because there is an explicit mount
request for / somewhere (initrd getting the root device off the kernel
command line maybe?) What is the "correct" fix? Presumably somehow
to tell the initrd to pivot it out a bit better?
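
For comparing what the kernel itself reports against what df prints,
the entries stacked on a given mount point can be listed straight from
the kernel's mount table. This is only a minimal sketch, assuming a
POSIX shell and awk; the function name mounts_on is made up for
illustration.

```shell
#!/bin/sh
# List every mount table entry whose mount point is exactly $1,
# printing the device and the filesystem type.
# $2 is the mount table to read; it defaults to /proc/mounts.
mounts_on() {
    awk -v mp="$1" '$2 == mp { print $1, $3 }' "${2:-/proc/mounts}"
}
```

Calling 'mounts_on /' on the client should print one line per
filesystem mounted on /, which makes it easy to see whether the
'rootfs' line comes from the kernel or only from /etc/mtab.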

The second entry is the correct one.

The third entry I presume is being put there as udev moved what it
sees as /dev out of place before it mounts its own fs there. Though
my non NFS-root Debian machines don't have this!

Because of the dependencies for the stock Debian kernels, I cannot
uninstall udev, but I think it would make no difference even if I
did because it looks like udev is in the initrd image, so even when
I specify:

ga010133vm3# grep static /etc/udev/udev.conf
no_static_dev="true"
ga010133vm3#

it makes no difference: the nfsroot is still mounted on
/dev/.static/dev.

What's going on? What is the "Debian way" to avoid getting root
mounted three times? Like I said, I want to stick with stock kernels,
and I want to edit /etc/init.d files as little as possible in order
to keep the system easily updatable.

What would be perfectly acceptable, if there is no cleaner
alternative, would be the addition of /etc/init.d/sort-it-all-out
to sort the mess out; but then I guess it's too late to go pivoting
the mounted root fs out of the way when other init.d scripts have
already started stuff on it (e.g. syslogd).

Incidentally, I tried the nfsbooted package, and that also resulted
in it being mounted three times.

/dev/null does not exist when sshd starts
-----------------------------------------

ga010133vm3# sshd
ga010133vm3# pgrep sshd
ga010133vm3# grep daemon /var/log/syslog
Jan 11 18:22:44 ga010133vm3 sshd[1411]: fatal: daemon() failed: No such device
ga010133vm3#

A google of this error message showed that sshd died because /dev/null
did not exist, and a look around /etc/init.d turned up this in udev:

# When modifying this script, do not forget that between the time that
# the new /dev has been mounted and udevtrigger has been run there will be
# no /dev/null. This also means that you cannot use the "&" shell command.

which seems to suggest udev is not behaving very well in this
environment.
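
One possible workaround, sketched here under the assumption that
/dev/null does eventually appear once udev has populated /dev, is to
have whatever starts sshd wait for the device first. The function name
wait_for_chardev is hypothetical, and the loop deliberately avoids "&"
as the udev script comment warns.

```shell
#!/bin/sh
# Wait until $1 (default /dev/null) exists as a character device,
# polling once a second for at most $2 attempts (default 10).
# Returns 0 as soon as the device is there, 1 if it never shows up.
wait_for_chardev() {
    dev="${1:-/dev/null}"
    tries="${2:-10}"
    while [ "$tries" -gt 0 ]; do
        [ -c "$dev" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

It could be used as 'wait_for_chardev /dev/null 10 && /usr/sbin/sshd',
though that only hides the ordering problem rather than explaining it.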

DHCP not passing hostname?
--------------------------

I'm using dhcp3 as the client and as the server. I see that if
'auto eth0' is in /etc/network/interfaces then the network
connection is lost (presumably NIC downed) prior to the call for
an IP address, and since / is NFS mounted at that point things go
quickly wrong.

But ifup also thinks that the NIC is not up yet; for purely cosmetic
reasons it would be nice if it understood it was up. One fix would be
to hack its state file, but it seemed nicer to add 'script "/bin/true";'
to /etc/dhcp3/dhclient.conf so that the call to dhclient is a
no-op. This works fine.

Prior to doing that, I set the script to be a one liner:

#!/bin/sh
env >> /tmp/log

I did this in order to see if I could modify /etc/init.d/hostname.sh
to use information provided by the DHCP to set the hostname; using
/etc/hostname to set the hostname would mean that I could not let a
second client mount the same NFS root filesystem. But oddly I did
not see any hostname in the log file; I see a lot of other stuff:
IPs, gateways, DNS servers, but no hostname.

In the end I modified the hostname.sh to include:

HOSTNAME=$(/usr/bin/host $(ifconfig eth0 | sed -n 's/.*inet addr:\([^ ]*\).*/\1/p') | sed -n 's/.* pointer \([^\.]*\).*/\1/p')

I.e. to look its own IP up in DNS and get its own hostname that way. 
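
The same one-liner can be split into two small filters, which makes
each sed expression easier to check on its own. This is just a sketch
of the approach above; the function names are made up, and it assumes
the old net-tools 'inet addr:' output format and the usual
'domain name pointer' line printed by host.

```shell
#!/bin/sh
# Read `ifconfig <iface>` output on stdin and print the IPv4 address
# (old net-tools format: "inet addr:a.b.c.d ...").
ip_from_ifconfig() {
    sed -n 's/.*inet addr:\([^ ]*\).*/\1/p'
}

# Read `host <ip>` output on stdin and print the short hostname,
# i.e. everything after " pointer " up to the first dot.
short_name_from_host() {
    sed -n 's/.* pointer \([^.]*\).*/\1/p'
}
```

With these, the assignment becomes
HOSTNAME=$(/usr/bin/host "$(ifconfig eth0 | ip_from_ifconfig)" | short_name_from_host)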

But it ought to be possible to set it from info received via
DHCP: either from the call the kernel makes because of its 'ip=dhcp'
kernel parameter, or from the call I'm allowing
/etc/init.d/networking to make by having 'auto eth0'?
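
For the dhclient route, my understanding is that dhclient hands each
received option to its script as an environment variable prefixed
with new_, so a hostname would arrive as $new_host_name. That only
happens if the server actually sends the host-name option and the
client requests it, which may be why nothing showed up in my log.
A minimal sketch of a hook that would use it, with a made-up function
name:

```shell
#!/bin/sh
# Print the hostname offered by DHCP, if there is one.
# Assumption: dhclient exports the host-name option to its script as
# $new_host_name; if the variable is unset or empty, print nothing and
# return non-zero so the caller can fall back to another method.
dhcp_hostname() {
    [ -n "${new_host_name:-}" ] || return 1
    printf '%s\n' "$new_host_name"
}
```

A script run by dhclient could then do
'name=$(dhcp_hostname) && hostname "$name"' and fall back to the DNS
lookup when no name is offered.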

Any advice anyone can offer please? Many thanks!

Alexis


