Re: Fwd: Re: Debian on mac68k

To: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Stefan Niestegge <beetle@abbuc.de>, debian-68k@lists.debian.org
Subject: Re: Fwd: Re: Debian on mac68k
From: Finn Thain <fthain@telegraphics.com.au>
Date: Sun, 5 Jun 2016 15:25:24 +1000 (AEST)
Message-id: <alpine.LNX.2.00.1606051154190.30108@nippy.intranet>
In-reply-to: <2b271993-22b6-9407-e421-a79561e77c17@physik.fu-berlin.de>
References: <574C132F.3070404@abbuc.de> <574C15C2.90908@abbuc.de> <ec5ded9d-b3c0-0bba-65b7-1e4d2cc063a2@physik.fu-berlin.de> <574C1792.4030403@abbuc.de> <a6490d67-9b17-30a7-73b4-01dfcf0cc0c0@physik.fu-berlin.de> <574C1FB8.4090800@abbuc.de> <81744fd9-36b7-2181-4f9c-d44160f7f966@physik.fu-berlin.de> <alpine.LNX.2.00.1605302324490.8210@nippy.intranet> <alpine.LNX.2.00.1606031303210.6355@nippy.intranet> <8d419bd1-4de1-5f4d-85ae-533fdbc51de8@physik.fu-berlin.de> <alpine.LNX.2.00.1606041050510.24670@nippy.intranet> <2b271993-22b6-9407-e421-a79561e77c17@physik.fu-berlin.de>

On Sat, 4 Jun 2016, John Paul Adrian Glaubitz wrote:

> On 06/04/2016 07:03 AM, Finn Thain wrote:
> 
> > Yes, systemd is very slow to boot on this system, so timeouts are 
> > likely to be the problem.
> 
> The timeouts are related to dbus, not to systemd.

I thought you said, "dbus timeout in systemd" ?!

> I also do not think that sysvinit would actually be faster. sysvinit 
> runs shell code for boilerplate stuff while systemd does that with 
> built-in C code and I don't think we disagree here that bash code with 
> lots of forking is always slower than compiled C code.

You can see from my post that booting systemd to a login prompt forked 170 
processes. So I think any claim that systemd, sysvinit or openrc is faster 
or slower needs to be benchmarked with current packages. Things have 
bloated since the sysvinit in etch-m68k (and its dependencies).
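
A minimal sketch of one way such a fork count could be taken for a benchmark, 
assuming a Linux /proc filesystem is mounted. It reads the "processes" 
counter from /proc/stat, which the kernel documents as the number of forks 
since boot. The function name forks_since_boot is hypothetical and is not 
taken from the thread; how the figure of 170 was actually obtained is not 
stated here.

```python
def forks_since_boot(stat_text):
    """Return the fork counter from /proc/stat content, or None if absent.

    stat_text is the full text of /proc/stat. The line of interest looks
    like "processes 170" and counts forks since the kernel booted.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        # Match the exact two-field "processes N" line only, so that
        # "procs_running" and "procs_blocked" are not picked up by mistake.
        if len(fields) == 2 and fields[0] == "processes":
            return int(fields[1])
    return None
```

To compare init systems, the function could be fed the contents of 
/proc/stat (for example via open("/proc/stat").read()) immediately after 
the login prompt appears under each one, on the same kernel and package set.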

> 
> I have, in fact, used systemd on several Amigas and it was faster than 
> sysvinit, even on my old trusty A1200 with 68030/50 CPU and 64 MiB. If I 
> remember correctly, I was also using systemd on my Centris 650 but I 
> cannot say for sure, the machine is in the basement at the moment.

Boot time doesn't matter that much to me because my kernel testing usually 
involves busybox, not Debian. So I'm not going to benchmark the 
alternatives.

> For a fair comparison, you would actually have to sum over all the 
> instances of bash, sed, grep, awk and whatnot that sysvinit starts in 
> order to boot the machine. I'm pretty sure those will add up to more 
> than two minutes of CPU time.

I was not comparing the CPU cost of systemd with any of the alternatives. 

I was merely pointing out that timeouts are likely to be a problem on a 25 
MHz 68030 where the _lower_bound_ on CPU consumption is over two minutes.

> > 1970-01-01 04:45:26 (0.00 B/s) - Read error at byte 0/63850 (Invalid 
> > argument). Retrying.
> 
> Hmm, that's very odd. I haven't seen similar issues on my m68k machines. 
> Is there a way to verify that the kernel itself works as expected?

If I boot etch-m68k with the same kernel, the wget error is the same. It 
still happens with the 'ipv6' module loaded. I'll have to do more 
debugging.

The apt-get crash happened under Linux v4.1.23, but I probably should use 
something newer. For example, when I upgrade to v4.7-rc1, the following 
crash goes away:

root@pacman:~# 
root@pacman:~# systemctl
Unable to handle kernel access at virtual address d0083000
Oops: 00000000
Modules linked in:
PC: [<0015a08e>] __generic_copy_to_user+0x22/0x40
SR: 2000  SP: 00e8fca0  a2: 0115ef80
d0: 00000ee6    d1: 00000000    d2: 61646168    d3: 00004000
d4: 00004000    d5: 00598000    a0: 00598468    a1: d0083002
Process systemctl (pid: 173, task=0115ef80)
Frame format=B ssw=0729 isc=0801 isb=0001 daddr=d0083000 dobuf=61646168
baddr=0015a092 dibuf=61646168 ver=f
Stack from 00e8fd28:
        00004000 0015e50c d0082b9a 00598000 00004000 00006000 000000ba 000000ba
        00004000 00000000 000040ba 00ebf630 00e8ff54 0015e440 00017f7c 0015a06
        001dc066 0036f960 00000000 00004000 00e8ff54 ffffffa1 000060a2 00000000
        00fd726c 000060a2 00000000 00fd7200 00ebf630 00ebf648 00e8fe34 40004040
        0023677c 00ebf630 00000018 00e8ff54 000060a2 40004040 ef865860 ef866b54
        ef866b5c 00000000 017d7541 00e8ff4c ef865844 013b0000 00000000 00000000
Call Trace: [<00004000>] do_rt_sigreturn+0xbc/0x382
 [<0015e50c>] copy_page_to_iter+0xcc/0x1c2
 [<00004000>] do_rt_sigreturn+0xbc/0x382
 [<00006000>] do_page_fault+0xe0/0x1d0
 [<00004000>] do_rt_sigreturn+0xbc/0x382
 [<000040ba>] do_rt_sigreturn+0x176/0x382
 [<0015e440>] copy_page_to_iter+0x0/0x1c2
 [<00017f7c>] warn_slowpath_null+0x0/0x1c
 [<0015a06c>] __generic_copy_to_user+0x0/0x40
 [<001dc066>] skb_copy_datagram_iter+0xb2/0x14a
 [<00004000>] do_rt_sigreturn+0xbc/0x382
 [<000060a2>] do_page_fault+0x182/0x1d0
 [<000060a2>] do_page_fault+0x182/0x1d0
 [<0023677c>] unix_stream_recvmsg+0x376/0x568
 [<000060a2>] do_page_fault+0x182/0x1d0
 [<001d21ba>] copy_msghdr_from_user+0xf6/0x104
 [<001d224a>] ___sys_recvmsg+0x82/0xea
 [<000060a2>] do_page_fault+0x182/0x1d0
 [<0006b4a8>] expand_stack+0x0/0x34
 [<00037d62>] rcu_bh_qs+0x0/0x2e
 [<0006e996>] page_add_new_anon_rmap+0x62/0x6c
 [<00068206>] handle_mm_fault+0x644/0x944
 [<000060a2>] do_page_fault+0x182/0x1d0
 [<0006b4a8>] expand_stack+0x0/0x34
 [<0008c614>] __fdget+0xc/0x10
 [<001d22e6>] __sys_recvmsg+0x34/0x5c
 [<00006000>] do_page_fault+0xe0/0x1d0
 [<00001130>] kernel_pg_dir+0x130/0x1000
 [<001d2320>] SyS_recvmsg+0x12/0x18
 [<001d1b4a>] SyS_socketcall+0x1e8/0x22e
 [<00019468>] do_exit+0x49e/0x6c6
 [<00002918>] syscall+0x8/0xc
 [<0008c002>] notify_change+0x152/0x284

Code: 7403 c282 4a80 670a 2418 0e99 2800 5380 <66f6> 0801 0001 6706 3418 
0e59 2800 0801 0000 6706 1418 0e19 2800 241f 4e75 2f02
Disabling lock debugging due to kernel taint
note: systemctl[173] exited with preempt_count 1
Segmentation fault
root@pacman:~# 

This is clearly a kernel bug. I have noticed that 68030 systems don't seem 
to get much testing. That is the best explanation I have for the crashes.

I wonder if anyone has any experience with Debian on Hatari (i.e. emulated 
68030 with MMU)...

> I haven't seen any particular issues with segmentation faults on m68k 
> unless there was a problem with the hardware.

No, the hardware seems to be fine. I ran md5sum on a 500 MB SCSI disk and 
it gave the right answer. It took hours. There's no flakiness here; the 
bugs are easily reproducible. This logic board has new capacitors, BTW.
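
For readers who want to repeat this kind of read-back check without md5sum, 
here is a rough sketch in Python. It hashes a stream in fixed-size chunks so 
that a large disk image or block device never has to fit in memory, which 
matters on a machine with little RAM. The function name md5_of_stream and 
the 64 KiB chunk size are illustrative assumptions; the check described 
above was done with md5sum itself.

```python
import hashlib


def md5_of_stream(stream, chunk_size=64 * 1024):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        digest.update(chunk)
    return digest.hexdigest()
```

The stream would be opened in binary mode, e.g. open(path, "rb"), and the 
result compared against a digest computed on a known-good machine.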

> > 
> > Thanks. Perhaps the new debootstrap will fix the wget error.
> 
> I haven't got around to doing that yet.

No rush.

> 
> > ntpdate is always useful for old machines that may have no clock 
> > battery (as old batteries tend to leak acid) or have no RTC support in 
> > the kernel, which is the case for certain Mac models.
> 
> I have actually had much better experiences with systemd-timesyncd, 
> which has replaced ntpdate on my buildds. It works much more reliably 
> and was very easy to set up (the Arch Wiki has all the details):

A daemon is not worth the overhead. Again, my package preferences are 
unlikely to be appropriate for most p.d.o systems.

> Good to know. Could you maybe file a bug report against src:linux in the 
> Debian bug tracker? Ben Hutchings will happily incorporate the change, 
> he is Debian's main kernel maintainer.

Will do.
