
Bug#401916: Bug 401916: analysis and suggested solution



On Mon, Feb 19, 2007 at 07:21:08PM +0100, maximilian attems wrote:
heya david,

Hey Maks :)

On Fri, 16 Feb 2007, David Härdeman wrote:
Short-term solution:

Therefore, I think the best short-term solution (considering the
ever-impending Etch release) would be to add the "root_wait=" boot
parameter so that affected users can set the timeout value manually. If
that parameter was added, and documented in the release docs, the severity
of these bugs could be downgraded (imho).

well we already check the rootdelay variable and it could easily be
exported and checked by the udev hook.
please no new boot variables. also the above is the original meaning of
rootdelay, we have just currently "perverted" its usage.
so yes this can be done easily.

Oh, I missed that variable...I've written a patch now to export it.
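
For illustration, here is a minimal sketch of how a rootdelay= value could be picked out of the kernel command line before being exported. The function name parse_rootdelay is hypothetical and not part of initramfs-tools; the real init script does its parsing inline in the "for x in $(cat /proc/cmdline)" loop shown in the patch context below.

```shell
#!/bin/sh
# Hypothetical helper (not from initramfs-tools): print the value of the
# rootdelay= parameter found in the command line string passed as $1.
# Prints an empty line if the parameter is absent.
parse_rootdelay() {
	delay=
	for x in $1; do
		case $x in
		rootdelay=*)
			# Strip the "rootdelay=" prefix, keep the value
			delay="${x#rootdelay=}"
			;;
		esac
	done
	echo "${delay}"
}

# In an initramfs this would presumably be fed from /proc/cmdline, e.g.:
#   ROOTDELAY=$(parse_rootdelay "$(cat /proc/cmdline)")
#   export ROOTDELAY
```

With ROOTDELAY exported like this, scripts run later (such as the udev premount script) can read it from their environment.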

Alternatively, or additionally, the scripts could check whether one of
several "problematic" modules has been loaded when udevsettle returns and
if so, sleep a couple of extra seconds (most other distros that take this
approach seem to wait around 6 - 10 seconds). The problem is that the list
of problematic modules is potentially huge (see list of buses above)

additionally sounds like a good idea, plus an extra udevsettle call.
please cook up a patch for mika.

I've attached a patch against the udev and initramfs-tools source packages that implements the following changes. Please review:

1) ROOTDELAY is exported by initramfs-tools and used in udev if set

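2) Checks in udev for scsi/firewire/usb have been added; if usb or firewire is found, or scsi is found and scsi_wait_scan (see 3) cannot be loaded, a default of 10 seconds of sleep is added unless ROOTDELAY is already set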

3) The module scsi_wait_scan will be loaded by udev if scsi is detected, modprobe should only return from loading that module once all scsi busses have been scanned (I found that module yesterday...pretty nifty band-aid solution to the problem, does not help with usb/firewire though).

The only problem with the approach is that a large majority of all machines have usb, which means that we'll slow down the boot for all those machines even though only a small minority are affected.

Thanks to Mika for the preliminary testing done so far, it has helped my understanding of the underlying problem...

Long-term solution:
...
Take all scripts under /usr/share/initramfs-tools/scripts/local-top/ that
setup block devices (i.e. cryptsetup, lvm, evms, etc), and split them in
two, a udev rule snippet and a script.
...
Then the main init script is changed to sleep until $ROOT (not /dev/root
but whatever is set as the $ROOT variable) appears
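
As a rough sketch of the "sleep until $ROOT appears" idea, the loop could look something like this. The function name wait_for_root, the one-second polling interval and the timeout argument are all assumptions for illustration, and it assumes $ROOT is a plain device path (a UUID= or LABEL= value would need resolving first).

```shell
#!/bin/sh
# Hypothetical helper: wait until the path given as $1 exists, polling once
# per second for at most $2 seconds.
# Returns 0 if the path appeared, 1 if the timeout was reached first.
wait_for_root() {
	dev=$1
	timeout=$2
	waited=0
	while [ ! -e "${dev}" ]; do
		if [ "${waited}" -ge "${timeout}" ]; then
			return 1
		fi
		sleep 1
		waited=$((waited + 1))
	done
	return 0
}

# Possible use in the main init script (values are examples only):
#   wait_for_root "${ROOT}" 30 || panic "Root device ${ROOT} did not appear"
```

The advantage over a fixed sleep is that machines whose root device shows up quickly do not pay the full delay.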

i agree that this was a possible and probable plan i thought of.
disadvantage: currently you can exchange udev with some simple hotplug
script out of initramfs-tools and everything will work fine.

If you want a static setup you could still call all those scripts with some static arguments (e.g. /dev/hda, /dev/sda)

also the idea is to have a MODULES=MOST target that would
just add/run the needed modules and thus not include udev,
then the above approach is in trouble.

How would MODULES=MOST create stuff under /dev then?

so i'd put that up for discussion and we'll have enough time
to figure that out postetch.

Agreed.

--
David Härdeman
diff -Nur initramfs-tools-0.85e-orig/init initramfs-tools-0.85e/init
--- initramfs-tools-0.85e-orig/init	2006-11-03 09:03:44.000000000 +0100
+++ initramfs-tools-0.85e/init	2007-02-19 21:05:47.000000000 +0100
@@ -46,6 +46,7 @@
 export debug=
 export cryptopts=${CRYPTOPTS}
 export panic
+export ROOTDELAY=
 
 for x in $(cat /proc/cmdline); do
 	case $x in
diff -Nur udev-0.105-orig/extra/initramfs.hook udev-0.105/extra/initramfs.hook
--- udev-0.105-orig/extra/initramfs.hook	2006-05-16 18:16:49.000000000 +0200
+++ udev-0.105/extra/initramfs.hook	2007-02-19 20:54:13.000000000 +0100
@@ -16,6 +16,9 @@
 # udevd uses unix domain sockets for communication
 force_load unix
 
+# this is used to ensure that SCSI disks have been scanned
+manual_add_modules scsi_wait_scan
+
 cp -a /etc/udev/ $DESTDIR/etc/
 cp /etc/scsi_id.config $DESTDIR/etc/
 rm -f $DESTDIR/etc/udev/rules.d/*_hotplugd.rules # XXX
diff -Nur udev-0.105-orig/extra/initramfs.premount udev-0.105/extra/initramfs.premount
--- udev-0.105-orig/extra/initramfs.premount	2006-12-19 11:19:23.000000000 +0100
+++ udev-0.105/extra/initramfs.premount	2007-02-19 21:08:03.000000000 +0100
@@ -20,5 +20,45 @@
 udevtrigger
 udevsettle || true
 
+# Check for problematic devices
+problem=0
+
+# USB / FireWire
+if grep -q "usb\|ieee1394" /proc/devices; then
+	problem=1
+fi
+
+# SCSI
+if [ -e "/proc/scsi" ]; then
+	modprobe -q "scsi_wait_scan" || problem=1
+	udevsettle || true
+fi
+
+if [ ${problem} -gt 0 ]; then
+	if [ -z "${ROOTDELAY}" ]; then
+		ROOTDELAY=10
+	fi
+fi
+
+# Optionally, wait a user-defined number of seconds
+if [ -z "${ROOTDELAY}" ]; then
+	slumber=0
+else
+	slumber=${ROOTDELAY}
+fi
+
+if [ ${slumber} -gt 0 ]; then
+	log_begin_msg "Waiting for additional devices..."
+	if [ -x /sbin/usplash_write ]; then
+		/sbin/usplash_write "TIMEOUT 0" || true
+	fi
+	/bin/sleep ${slumber}
+	udevsettle || true
+	if [ -x /sbin/usplash_write ]; then
+		/sbin/usplash_write "TIMEOUT 15" || true
+	fi
+fi
+
+
 # Leave udev running to process events that come in out-of-band (like USB
 # connections)
