
Re: Ultra5 successful install - PGX64 issues



Hi,

and sorry for the delay, I was a little short of spare time this week. :-/

On 04/15/2018 10:34 AM, Helge Deller wrote:
On 14.04.2018 20:13, Frank Scheiner wrote:
I know from my own testing that the following "smaller" machines work with Debian GNU/Linux Sid for hppa:

* 712/80
* c3700, c3750, J5600, rp2470
* c8000, rp3440

Apart from the rp3440 - and maybe also the 712/80 which showed some issue with its built-in NIC after netbooting the Linux kernel and the OS

What kind of problems?

Unfortunately I don't seem to have made any notes on the issue with the 712/80, so earlier this week I retried with the configuration I assumed had caused it:

This configuration used Debian Linux kernel 4.9.25-1 (4.9.0-3-parisc from 2017-05-02). When netbooting it, the machine seems to lose contact with the NFS server shortly after login:

```
[...]
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.

Debian GNU/Linux buster/sid hp-712 ttyS0

hp-712 login: root
Password:
Last login: Thu Sep 18 11:30:50 CET 1902 from 172.16.1.1 on pts/0
Linux hp-712 4.9.0-3-parisc #1 Debian 4.9.25-1 (2017-05-02) parisc

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

[  232.973913] nfs: server 172.16.0.2 not responding, still trying
[  233.094265] nfs: server 172.16.0.2 not responding, still trying
[  233.205127] nfs: server 172.16.0.2 not responding, still trying
[  233.568429] nfs: server 172.16.0.2 not responding, still trying
[  233.692383] nfs: server 172.16.0.2 not responding, still trying
[  233.808818] nfs: server 172.16.0.2 not responding, still trying
[...]
[  235.179253] nfs: server 172.16.0.2 OK
[  235.251896] nfs: server 172.16.0.2 not responding, still trying
[...]
```

Although it seems to be able to reconnect from time to time, the machine is not accessible.

Afterwards I found some older notes about this machine that mention no issues during diskless operation with the very same configuration (kernel and possibly also userland). That made me wonder whether there might be an issue between the machine's built-in NIC and the 1000 Mbit network switch I use. And indeed, when I connected a 100 Mbit network switch between the 712/80 and the 1000 Mbit network switch, the issue seemed to be gone and the machine stayed accessible.

But later this week I retried the 712/80 with the current Linux kernel (4.15.x) and Debian userland, and the issue hit me again, although much later and despite the 100 Mbit network switch in between. Looking at that switch, I could see that the collision indicator was active for the port used by the 712/80. I then configured a single port of the 1000 Mbit network switch to 10 Mbit full duplex and attached the 712/80 to it, and the issue again seemed to be gone. But trying to install a package or update the package cache quickly triggered it again. Well, that's not much of an issue, as I can do the package management for the 712/80 from another machine (e.g. the c8000).

Also interesting are the kernel messages for 4.15.11. Please note the time difference between "random: crng init done" and "Key type asymmetric registered":

```
[    0.000000] Linux version 4.15.0-2-parisc (debian-kernel@lists.debian.org) (gcc version 7.3.0 (Debian 7.3.0-12)) #1 Debian 4.15.11-1 (2018-03-20)
[    0.000000] unwind_init: start = 0x1086e8b4, end = 0x108c5644, entries = 22233
[    0.000000] FP[0] enabled: Rev 1 Model 13
[    0.000000] The 32-bit Kernel has started...
[...]
[    9.919844] workingset: timestamp_bits=14 max_order=15 bucket_order=1
[   10.168866] zbud: loaded
[   56.112387] random: crng init done
[  433.392379] Key type asymmetric registered
[  433.445502] Asymmetric key parser 'x509' registered
[...]
[  544.565451] systemd[1]: Detected architecture parisc.

Welcome to Debian GNU/Linux buster/sid!
[...]
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.

Debian GNU/Linux buster/sid hp-712 ttyS0

hp-712 login:

```

On the first try I assumed the machine or the kernel had hung, but no, it was still working the whole time.

Today I tested it again (with 4.15.11) and this time the issue hit me already during login, right after I entered the username.

So I'm actually back where I started. :-(

I suspect that the built-in 82596 NIC maybe cannot cope with the amount of traffic that occurs during diskless operation, although I then wonder why it has no problem during the TFTP transfer that loads the lifimage. The next thing I'll examine is the parameters used for the NFS mount (especially rsize and wsize), if I can ever log in to it again :-). And maybe I'll add a fan for the passive heat sink of the CPU, which gets quite hot during operation.

Any suggestions on where else to look?
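To spot such stalls without reading the whole log by eye, here is a minimal Python sketch that reports large jumps between consecutive kernel timestamps. The function name `find_gaps` and the 30 second threshold are my own assumptions for illustration; the sample lines are taken from the log above.

```python
import re

# Matches a leading kernel timestamp such as "[  433.392379] message".
# Lines like "[...]" or "[  OK  ] ..." do not match and are skipped.
TS_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_gaps(lines, threshold=30.0):
    """Return (gap_seconds, previous_message, message) for every pair of
    consecutive timestamped lines that are more than `threshold` seconds apart."""
    gaps = []
    prev = None
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue
        ts, msg = float(m.group(1)), m.group(2)
        if prev is not None and ts - prev[0] > threshold:
            gaps.append((ts - prev[0], prev[1], msg))
        prev = (ts, msg)
    return gaps


sample = [
    "[   10.168866] zbud: loaded",
    "[   56.112387] random: crng init done",
    "[  433.392379] Key type asymmetric registered",
    "[  433.445502] Asymmetric key parser 'x509' registered",
]

for gap, before, after in find_gaps(sample):
    print(f"{gap:10.3f} s between {before!r} and {after!r}")
```

With the lines above it flags both the roughly 46 s wait before "random: crng init done" and the roughly 377 s wait before "Key type asymmetric registered".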
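For checking which rsize and wsize are actually in effect, a minimal Python sketch that parses text in the format of `/proc/mounts` could look like this. The function name `nfs_mount_options` is hypothetical, and the sample line (export path and all option values) is made up for illustration, not taken from the 712/80.

```python
def nfs_mount_options(mounts_text):
    """Map each NFS mount point to its rsize/wsize/proto/vers options.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        # Options missing from the mount line are reported as None.
        result[fields[1]] = {k: opts.get(k) for k in ("rsize", "wsize", "proto", "vers")}
    return result


# Made-up sample line; on a live system the text would come from
# open("/proc/mounts").read() instead.
sample = (
    "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
    "172.16.0.2:/srv/nfs/hp-712 / nfs rw,relatime,vers=3,rsize=4096,wsize=4096,proto=udp 0 0\n"
)

for mountpoint, opts in nfs_mount_options(sample).items():
    print(mountpoint, opts)
```

Comparing these values between the working and the failing setup would at least show whether the mount parameters differ.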

****

For the rp3440 I (also) have to retract my earlier statement, as it looks like my second rp3440 actually **works** diskless. I have to retest with my first rp3440 (currently in storage), as it seems to behave differently in this regard, or maybe I misconfigured something there in the past.

But for my second rp3440 I still had to blacklist the `radeon` module to achieve this, as otherwise the system (console) seems to crash shortly before or just after the login prompt appears. This is the kernel command line I used, as configured with palo 1.99 and Linux 4.14.x:

```
Current command line:
0/vmlinux HOME=/ root=/dev/nfs ip=:::::enp32s2:dhcp modprobe.blacklist=radeon initrd=0/ramdisk TERM=vt102 console=ttyS0
 0: 0/vmlinux
 1: HOME=/
 2: root=/dev/nfs
 3: ip=:::::enp32s2:dhcp
 4: modprobe.blacklist=radeon
 5: initrd=0/ramdisk
 6: TERM=vt102
 7: console=ttyS0
```
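To verify on a running system that the blacklist entry really made it onto the command line, the string can be pulled apart with a few lines of Python. This is only a rough sketch: the helper names `parse_cmdline` and `blacklisted_modules` are my own, and the plain whitespace split does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into bare words and key=value parameters."""
    words = []
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
        else:
            words.append(token)
    return words, params


def blacklisted_modules(params):
    """Return the module names listed in modprobe.blacklist (comma-separated)."""
    value = params.get("modprobe.blacklist", "")
    return [name for name in value.split(",") if name]


# The command line shown above; on a live system it could be read
# from /proc/cmdline instead.
cmdline = (
    "0/vmlinux HOME=/ root=/dev/nfs ip=:::::enp32s2:dhcp "
    "modprobe.blacklist=radeon initrd=0/ramdisk TERM=vt102 console=ttyS0"
)

words, params = parse_cmdline(cmdline)
print("bare words: ", words)
print("blacklisted:", blacklisted_modules(params))
```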

Interestingly, after upgrading all packages (obviously including palo) on the NFS root FS and building a new lifimage with Linux 4.15.x, blacklisting the radeon module no longer seems to be required. I'm not sure whether this is due to palo 2.00 or Linux 4.15.x. Anyway, the radeon module is no longer loaded automatically with this configuration.

****

So at least the rp3440 can actually work diskless as well. Good that you asked, Helge. :-)

Cheers,
Frank

