
Re: Problem chain loading Jessie's pxelinux via http using iPXE (failed to load ldlinux.c32)



Hi,

> I have an iPXE setup that allows me to chain load the Debian installer
> pxelinux setups over http so I don't need to mirror all the things
> locally, but so I still get the nice installer boot menu
> 
> The iPXE config is essentially this:
> set 209:string cfg/pxelinux.cfg
> set 210:string
> http://ftp.nl.debian.org/debian/dists/${debian_version}/main/installer-${debian_arch}/current/images/netboot/
> chain ${210:string}pxelinux.0
> 
> Until Jessie it's worked fine for years. pxelinux.0 is downloaded and
> having already set the two dhcp options above, it happily manages to go
> and find the pxelinux.cfg/default file and then all of the other things
> required by it and I get the installer boot menu.
> 
> With Jessie however, it seems to get pxelinux.0 ok, but then fails
> miserably at finding ldlinux.c32. It simply tells me Failed to load
> ldlinux.c32.

What you're trying to do won't work in Jessie, for two reasons: the
installer images on the mirrors were never fully adapted to changes in
pxelinux, and pxelinux itself has a regression.

Explanation:

pxelinux is part of the syslinux family of boot loaders, which consists
of syslinux, extlinux, pxelinux and some others. Wheezy still comes
with syslinux 4.x, but Jessie comes with 6.03.

Up to 4.x, syslinux built separate binaries that each contained their
own copy of a lot of shared code, including the code that parses the
configuration and shows the menu.

syslinux starting with 5.0 separated out the common code into a
separate binary called 'ldlinux.c32'. The individual other binaries
(pxelinux.0, syslinux.0, etc.) now only contain the code that is
required to load that binary. After loading that binary, they execute
it, which then takes care of everything else.

Problem: the pxelinux configuration file is read in by ldlinux.c32, but
pxelinux.0 has to locate ldlinux.c32 before it can load it. So the path
option in the configuration file can't be honoured at that point,
because the configuration file hasn't been read yet.

The solution here is to put the ldlinux.c32 file from the
syslinux-common package into the same directory as pxelinux.0. Then the
path logic in pxelinux.0 can find it and everything should work. [1] If
you look at the debian-installer-8-netboot-$ARCH packages, you will see
that they do just that: ldlinux.c32 is supplied right next to
pxelinux.0.

The solution would then be to ask the debian installer team to add
ldlinux.c32 to the root directory on the mirrors. Hmmm, it appears that
there's already a bug related to that:
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750586>
The fix that was applied changed netboot.tgz to properly include
the file, but did not change the files published on the mirrors.

But even if that bug is fixed, it still won't be enough for your use
case: before 5.10, pxelinux couldn't actually speak HTTP itself. So why
did it work in Wheezy? Because pxelinux used the still-loaded gPXE/iPXE
to perform the HTTP requests for it. Version 5.10 added code to
natively support network protocols such as HTTP, and a new binary,
'lpxelinux.0', is now built that supports them natively. The old
binary is still built with the legacy code.

Unfortunately, the legacy version 'pxelinux.0' in 6.03 doesn't support
calling out to gPXE/iPXE anymore, and at least from playing around a
bit with the current source code, it seems that the code has been
disabled for a while and can't easily be enabled anymore, because other
code changes prevent the code from being properly compiled...

So that means that the current 'pxelinux.0' will see the HTTP URI,
notice that it isn't TFTP (which it supports natively), no longer have
the code to call out to gPXE/iPXE (hence the regression), and just
plain fail. That is why you don't see any further network requests,
not even an attempt to load ldlinux.c32 (regardless of protocol): the
code gives up as soon as it fails to understand 'http'. :-(

I've reported this in the Debian bugtracker as:
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=814459>

So the solution for now (without fixing pxelinux) would be to provide
lpxelinux.0 and ldlinux.c32 in the installer root (i.e. what you set
for option 210); then you could point your gPXE/iPXE script to
lpxelinux.0, it would find ldlinux.c32 in the proper place and
everything would work again.
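
As a rough sketch, your iPXE config would then look something like
this. It is your original script with only the base URL and the
chained file name changed. The host boot.example.org and its path are
made up for illustration; I'm assuming a server of your own that
carries lpxelinux.0 and ldlinux.c32 next to the usual netboot files
(the official mirrors don't, at the moment):

    # hypothetical server; must hold lpxelinux.0 and ldlinux.c32
    # in the directory that option 210 points to
    set 209:string cfg/pxelinux.cfg
    set 210:string http://boot.example.org/jessie/netboot/
    # lpxelinux.0 speaks HTTP itself, unlike the legacy pxelinux.0
    chain ${210:string}lpxelinux.0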

I've attached a patch to the bug report
<https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750586>
that does just that. (See the very bottom.)

Side note: it's not very nice to do this with an official Debian mirror
and constantly cause them traffic. If you don't want to have to sync it
manually and that's why you're using an official mirror, please take a
look at apt-cacher-ng: just install it somewhere (maybe on the PXE
machine), and then use http://$SERVER:3142/debian as the mirror. (Note
that /var/cache should probably have a couple of Gigs available.)

Regards,
Christian
