Bug#1115309: elpa-magit: magit-branch fails with error message



Hi,

Christoph Groth <christoph@grothesque.org> writes:

> Yes, moving the native compilation cache makes the problem disappear!
> (And moving it back makes it reappear...)
>
> Thanks!
>
> Now I understand why the other local user was not affected.

Wonderful, you're welcome, and understanding is treasure! :)

> Before I upgraded to trixie, I was using magit and transient from elpa,
> but then I upgraded to elpa-* packages to simplify things.  I guess that
> my package version history is pretty unique and as such the particular
> manifestation I observed might be very rare.

Thank you very much for this; it brings us closer to a reproducer.
Maybe this time we'll finally figure out how to trigger it reliably
(hopefully there's only one bug!), which of course will point to what
the actual cause is, and thus towards a resolution.

At what point did you switch from ELPA-from-upstream-packages to
Debian-provided elpa-* packages?  I'm guessing it's this:

  1. Bookworm's Emacs 29.4 with Debian-provided-packages.
  2. Installed Emacs 30.1 from bookworm-backports, which broke many
  packages because compatible backports of those packages were not
  provided.
  3. So you solved this by deinstalling the affected Debian-provided
  elpa-* packages and installing ELPA-from-upstream-provided ones.
  4. Upgraded to trixie.
  5. Did you run Emacs and activate magit at this point?
  6. How did you deinstall ELPA-from-upstream-provided packages?
  7. Installed Debian-provided elpa-* packages.
  8. Opened Emacs and...breakage (in at least magit)

> I kept the eln-cache-broken directory in case it can somehow serve to
> debug the deeper problem at play here.  Please let me know if there’s
> a way in which we can help Emacs upstream here.

Thank you!  I'm curious about a couple of things, but let's start with
this:  Do you have the following eln-cache subdirs (if you include full
output, please `ls -l`)?

  29.4-6c7920b0, 30.1-9b0c1374, 30.1-e3a6a941
  # 30.1-9b0c1374 is for the bookworm-backport on my system
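
If a summary is easier to paste than the full listing, something like
the following might do.  It is only a sketch: `list_eln_versions` is a
hypothetical helper name, and the usage line assumes the cache lives
under ~/.emacs.d/eln-cache (it may be under ~/.config/emacs instead).

```shell
# Hypothetical helper: print each eln-cache subdirectory name followed by
# the number of .eln files directly inside it.
list_eln_versions() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        n=$(find "$d" -maxdepth 1 -name '*.eln' | wc -l | tr -d ' ')
        printf '%s %s\n' "$(basename "$d")" "$n"
    done
}

# Usage (assumed default location; adjust to your setup):
#   list_eln_versions "$HOME/.emacs.d/eln-cache"
```

The plain `ls -l` output is still preferable if you can include it,
since it also shows timestamps.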

Then, let's move your good eln-cache someplace safe, and then copy
eln-cache-broken to eln-cache to resurrect the bug.  Then

  $ strace -e trace=open,openat emacs > ~/.emacs-eln-bug.strace 2>&1

to see if Emacs is opening transient's native-comp bits from the right
cachedir.  The most efficient way to attack this problem is probably
something like

  1. Search for "30.1-9b0c1374", the bookworm-backport eln-cache
  2. Search for a line that looks something like:

  openat(AT_FDCWD, "/home/user/.emacs.d/eln-cache/30.1-e3a6a941/transient-ff32346b-0aab668a.eln", O_RDONLY|O_CLOEXEC) = 9

  Please use some kind of regex, pattern, or glob-enabled search for
  this!  I expect that you'll find that Emacs is loading from the
  correct cache dir, so this test is mostly to establish a baseline.
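
One possible way to run both searches against the strace log is
sketched below.  The function names are made up for illustration, and
the patterns assume the log lines look like the openat example above.

```shell
# Hypothetical helper: count successful .eln opens per eln-cache
# version directory.  Prints "<version-dir> <count>" per line.
eln_dirs_loaded() {
    grep -E 'eln-cache/[^/]+/[^"]*\.eln"' "$1" \
        | grep -v '= -1 ' \
        | sed -E 's|.*eln-cache/([^/]+)/.*|\1|' \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Hypothetical helper: show every attempt to open transient's .eln,
# including failed lookups, so the cache dir is visible.
transient_eln_opens() {
    grep -E 'eln-cache/[^/]+/transient-[^"/]*\.eln"' "$1"
}

# Usage (log path taken from the strace command):
#   eln_dirs_loaded ~/.emacs-eln-bug.strace
#   transient_eln_opens ~/.emacs-eln-bug.strace
```

If 30.1-9b0c1374 shows up in the first listing, Emacs is loading
something from the bookworm-backport cache.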

The test after this will be to determine why our elpa-{transient,
magit*,etc} packages didn't invalidate the cache generated from
ELPA-from-upstream native compilation...that's a much harder thing to
debug.  I'm guessing the resolution to that will probably look like
adding a method where the ELPA-source injects a source-cookie (GNU ELPA,
MELPA, dh-elpa, etc) so Emacs can use ELPA-source-cookie to invalidate
the eln-cache.  Foo-pkg.el seems like a convenient place for this.

Cheers,
Nicholas
