
Re: Serial lockup during smp boot [Sarge]



Hi.

I'm running getty on ttyS0. However, if I remove the getty on ttyS0, the result remains the same. The lockup occurs somewhere after init has started (as before) and lasts until init 2 is run (when syslogd/klogd kick in).

I am willing to bet on interrupt problems. It seems that 16 bytes is exactly the size of the HW FIFO of the UART on the motherboard. An input character triggers the serial interrupt handler, which in turn discovers that there is more data to be transmitted and starts the UART TX. However, the mechanism for refilling the FIFO when it becomes empty again seems to fail. But this is only my educated guess.
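To make the theory concrete, here is a toy model in Python. It is only a sketch of the suspected behaviour, not the real serial driver: the class `ToyUart`, the flag `tx_irq_works` and the helper `bytes_sent_per_keypress` are all made up for illustration, and the 16-byte FIFO depth is the assumption from above.

```python
from collections import deque

FIFO_SIZE = 16  # assumed depth of the UART transmit FIFO


class ToyUart:
    """Toy model (hypothetical) of a UART whose TX-empty interrupt may be lost."""

    def __init__(self, tx_irq_works):
        self.tx_irq_works = tx_irq_works
        self.pending = deque()  # bytes the kernel still wants to send
        self.wire = []          # bytes that actually left the port

    def write(self, data):
        self.pending.extend(data)

    def _fill_fifo(self):
        # The handler loads at most FIFO_SIZE bytes; the hardware then
        # shifts them out on the wire.
        for _ in range(min(FIFO_SIZE, len(self.pending))):
            self.wire.append(self.pending.popleft())

    def interrupt(self):
        # Any serial interrupt (here: a received character) runs the handler.
        self._fill_fifo()
        # With a working TX interrupt, each "FIFO empty" event re-enters
        # the handler until nothing is left to send.
        while self.tx_irq_works and self.pending:
            self._fill_fifo()


def bytes_sent_per_keypress(tx_irq_works, message, keypresses):
    """Return how many bytes reach the wire after each simulated keypress."""
    uart = ToyUart(tx_irq_works)
    uart.write(message)
    sent = []
    for _ in range(keypresses):
        before = len(uart.wire)
        uart.interrupt()
        sent.append(len(uart.wire) - before)
    return sent


message = b"x" * 40
print(bytes_sent_per_keypress(False, message, 3))  # lost TX interrupt: [16, 16, 8]
print(bytes_sent_per_keypress(True, message, 3))   # working TX interrupt: [40, 0, 0]
```

With the TX interrupt "lost", the model emits one FIFO's worth of output per keypress, which is the pattern I see on the console.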

BTW /proc/interrupts looks like this:
          CPU0   CPU1
4:        368       0   IO-APIC-edge serial
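Note that every one of the IRQ 4 interrupts was counted on CPU0. For comparing such counts between boots, a small Python sketch like the following can split the output into per-CPU numbers. The function name `parse_interrupts` is made up, and it only handles simple rows like the one above; in practice the text would be read from /proc/interrupts rather than from a sample string.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()  # header row, e.g. ['CPU0', 'CPU1']
    rows = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters.
        for field in fields:
            if len(counts) < len(cpus) and field.isdigit():
                counts.append(int(field))
            else:
                break
        description = " ".join(fields[len(counts):])
        rows[irq.strip()] = (dict(zip(cpus, counts)), description)
    return rows


sample = """\
          CPU0   CPU1
4:        368       0   IO-APIC-edge serial
"""
rows = parse_interrupts(sample)
counts, description = rows["4"]
print(counts, description)
```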

Another strange behaviour: during the 16-bytes-of-output-per-input-character period, I have also seen console output that comes through normally. I can for example sit and press space slowly and watch the output advance 16 bytes at a time. But then suddenly the "e100: eth0: e100_watchdog: link up, 100Mbps, full-duplex" message shoots out all at once! After that I still have to keep pressing keys in the serial console to continue the "normal" boot sequence.

Are there any kernel settings, for APIC or similar, that can be used to control how the interrupts are handled?


Svein


Alexey Lobanov wrote:
Hello.

Two ideas.

1. What program is bound to ttyS0 in inittab? *getty?  Nothing? What
happens if you change this (something to nothing or vice versa)?

2. Damned IBM PC interrupts?  It may be déjà vu, but it seems to me that
I had the same effect on a Novell server 10 years ago; the reason was an
incorrect combination of interrupts on multiple ISA network cards.
Can you post /proc/interrupts and dmesg?

Alexey

On 05/10/05 22:35, Svein Seldal wrote:


Hi.

I have some strange problems during boot of my system:

I'm running stable (Sarge) with kernel-image-2.6.7-2-686 plus ditto-smp.
I have an Intel machine with an Intel P4 (family 15, model 4) @3GHz. I
have 1G of memory. My SATA HDDs are running software RAID-1 with LVM
(but that's probably irrelevant to this error).

I'm running a serial console as the main console. Hence this line is found
in my grub config: "kernel /vmlinuz-2.6.8-2-686-smp
root=/dev/mapper/System-Root vga=0x31A console=ttyS0,38400 ro"

If I boot my -smp kernel with this config, the serial console output
(and boot process) locks up after a while. It boots the kernel and starts
userspace apps as expected. It locks up approx. when it's doing the EXT3FS
mounts. The machine is then apparently dead; nothing more happens.
However, if I then type any character in my serial console, it will
output the next 16 bytes of the serial console output. Press a key, and
another 16 characters of console output are returned. And it will keep
on doing this until init 2 is run (apparently until syslogd and
klogd are started).

This behaviour is not observed when I use tty0 as the console, nor
when I boot the non-smp kernel. Nor does it lock up on shutdown.

Does anyone know about this "feature"? Is this something to submit a
bug report about?


Regards,
Svein Seldal




