"semop(1): encountered an error": the full story of the issue



TL;DR Blame systemd, this is not a hardware or rebuild issue.


Dear all,

Many of you have noticed that some packages fail to build with strange
errors in fakeroot. This has already been mentioned on the mailing
list [1]:

|    dh_lintian -a
|    dh_perl -a
| semop(1): encountered an error: Invalid argument
| make: *** [debian/rules:25: binary-arch] Error 1
| /usr/bin/fakeroot: 1: kill: No such process

The full build log is available [2] for those interested.

After a lot of difficulty reproducing the issue and some investigation
(thanks for the help on IRC), it turns out that this is due to a bad
interaction between the way manual signing is done and systemd:

- We can't use autosigning on the build daemons as they are not managed
  by DSA at this stage [3]. For that we first need to finish the
  rebootstrap (i.e. getting an empty list of imported packages [4]), and
  have an agreement with the release team that riscv64 will be added to
  testing. Note that autosigning was used with the debian-ports archive.

- In the past manual signing was done through email: the build log was
  sent to the buildd admin, who replied with the signed changes file.
  The addition of the buildinfo files, which also need to be signed,
  broke this process.

- I therefore implemented manual signing with a script which essentially
  runs debsign -r on the changes files, but uses rsync to avoid many scp
  transfers. For that I connect to the buildd@ account on the buildds
  with an SSH key.

- The buildds try to be as close as possible to the DSA setup, which
  uses a UID allocated through LDAP for the buildd user; it is therefore
  not a system user.

- fakeroot is used during the binary phase of the builds, except for
  packages which opt-out [5]. fakeroot uses IPC objects (semaphores and
  message queues) to communicate between the processes and a daemon
  which centralizes the permissions.

- The RemoveIPC option in /etc/systemd/logind.conf defaults to "yes" [6],
  which removes the IPC objects of a non-system user when that user
  fully logs out. This means that each time the manual signing finished,
  the login session was cleaned up and all IPC objects owned by the
  buildd user were deleted, including those of a running build. If
  fakeroot was in use at that moment, the daemon was no longer able to
  communicate with the processes, causing the "semop(1): encountered an
  error: Invalid argument" message. Depending on the package, the part
  run under fakeroot takes more or less time.
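
The failure mode described in the last point can be reproduced in
isolation. The following is only a minimal sketch, not fakeroot's actual
code: it assumes Linux with glibc, calls the System V semaphore functions
through ctypes, and hard-codes the IPC_* constants with their Linux
values. The helper name semop_after_removal is made up for this
illustration. It creates a semaphore set, removes it (roughly what logind
does to the user's IPC objects at logout), and then calls semop() on the
stale identifier.

```python
import ctypes
import errno
import os

# Constant values as defined in the Linux headers (assumption: Linux only).
IPC_PRIVATE = 0
IPC_CREAT = 0o1000
IPC_RMID = 0


class Sembuf(ctypes.Structure):
    """Mirror of struct sembuf from <sys/sem.h>."""
    _fields_ = [
        ("sem_num", ctypes.c_ushort),
        ("sem_op", ctypes.c_short),
        ("sem_flg", ctypes.c_short),
    ]


libc = ctypes.CDLL(None, use_errno=True)
libc.semget.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_int]
libc.semop.argtypes = [ctypes.c_int, ctypes.POINTER(Sembuf), ctypes.c_size_t]


def semop_after_removal() -> int:
    """Return the errno that semop() reports on a removed semaphore set."""
    # Create a private set with one semaphore, accessible to the owner only.
    semid = libc.semget(IPC_PRIVATE, 1, IPC_CREAT | 0o600)
    if semid < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # Remove the set while the identifier is still held by this process.
    if libc.semctl(semid, 0, IPC_RMID) < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # A non-blocking increment on the stale identifier now fails.
    op = Sembuf(0, 1, 0)
    rc = libc.semop(semid, ctypes.byref(op), 1)
    return ctypes.get_errno() if rc < 0 else 0


if __name__ == "__main__":
    err = semop_after_removal()
    print(errno.errorcode.get(err, err), os.strerror(err))
```

On Linux the reported error should be EINVAL, whose message is "Invalid
argument", the same text as in the build log above.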

For now the issue has been worked around by setting RemoveIPC to "no".
The real fix will be to hand over the build daemons to DSA so that they
can use autosigning.
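
To check which value a given logind.conf selects, something like the
following sketch could be used. It is an illustration under stated
assumptions, not a tool used on the buildds: the function name
remove_ipc_enabled is hypothetical, the file is treated as a plain
INI-style file with a [Login] section, and drop-in files under
/etc/systemd/logind.conf.d/ are ignored. When the option is absent or
commented out, it falls back to the documented default of "yes" [6].

```python
import configparser


def remove_ipc_enabled(conf_text: str, default: bool = True) -> bool:
    """Return the RemoveIPC value selected by the given logind.conf text."""
    # strict=False tolerates repeated keys; the last assignment wins.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    # Keep option names case-sensitive instead of lowercasing them.
    parser.optionxform = str
    parser.read_string(conf_text)
    # Lines starting with '#' are comments, so a commented-out option
    # falls through to the default.
    return parser.getboolean("Login", "RemoveIPC", fallback=default)


if __name__ == "__main__":
    with open("/etc/systemd/logind.conf", encoding="utf-8") as conf:
        print("RemoveIPC enabled:", remove_ipc_enabled(conf.read()))
```

With the workaround in place, the [Login] section contains an
uncommented "RemoveIPC=no" line and the function returns False.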

Regards,
Aurelien

[1] https://lists.debian.org/debian-riscv/2023/08/msg00013.html
[2] https://buildd.debian.org/status/fetch.php?pkg=dash&arch=riscv64&ver=0.5.12-6%2Bb1&stamp=1690523392&raw=0
[3] https://ftp-master.debian.org/wiki/projects/autosigning/
[4] https://ftp-master.debian.org/users/mhy/riscv64import.txt
[5] https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-rules-requires-root
[6] https://manpages.debian.org/testing/systemd/logind.conf.5.en.html

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                     http://aurel32.net
