
Re: Lobby this, somebody

	Here follows a defense of the Debian way of doing things: the
 technical reasons behind the decisions. This is a draft, and I think it
 is still overly verbose and repetitive, but I think that the issue
 needs to be addressed right now.

	As I have mentioned elsewhere, /etc/debian_version,
 /usr/src/.linux-versions, and /usr/src/kernel-headers-2.0.32 (the
 last named file name may change; it is correct as we speak)
 unequivocally identify a Debian system.



		 Kernel Headers and libc6-dev package

		    Need for kernel include files
                    ==== === ====== ======= =====
	Even though GNU libc 2.0 (a.k.a. libc6) provides a uniform
 interface to C programmers, one should realize that it needs
 different underpinnings on different architectures and operating
 systems (remember, glibc2 is multi-OS).

	glibc provides all the standard files that the C standard and
 POSIX require, and those in turn pull in OS and platform specific
 headers as required, transparently to the user. There is a complete
 divorce of the kernel-level interface from the user-level interface:
 the application programmer does not need to know kernel level details
 at all.

	 But this has been taken by some to mean that
 /usr/include/{linux,asm} would be superfluous, which is a technical
 impossibility given that glibc2 is not an architecture- and
 OS-specific library.

        I do not believe it is easy for glibc to present an interface
 that does not match the underlying OS, and quite possibly people just
 punted. If there is a mismatch between the user level structures and
 the kernel level structures, then the libc6 library has to install
 translating wrappers around system calls (not such a great idea for
 high performance systems). I can foresee cases where it would not be
 possible to implement these wrappers, given a sufficiently large set
 of architectures and OS's.
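
        To make the idea of a translating wrapper concrete, here is a
 minimal sketch in C. It is not glibc code: the types user_sigset and
 kernel_sigmask, the constant USER_NSIG, and both functions are
 hypothetical names chosen for illustration. It assumes a user level
 signal set with room for 128 signals and a kernel that only
 understands a single 32-bit mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-level signal set: room for 128 signals. */
#define USER_NSIG 128
typedef struct {
    uint32_t bits[USER_NSIG / 32];
} user_sigset;

/* Hypothetical kernel-level signal set: one 32-bit mask. */
typedef uint32_t kernel_sigmask;

/* Add a signal (numbered from 1) to the user-level set. */
void user_sigset_add(user_sigset *set, int signo)
{
    set->bits[(signo - 1) / 32] |= (uint32_t)1 << ((signo - 1) % 32);
}

/*
 * The translating step a wrapper would perform before a system call.
 * Returns 0 on success, or -1 when the user-level set names a signal
 * that the smaller kernel structure cannot represent.
 */
int to_kernel_mask(const user_sigset *set, kernel_sigmask *out)
{
    for (size_t i = 1; i < USER_NSIG / 32; i++) {
        if (set->bits[i] != 0)
            return -1;
    }
    *out = set->bits[0];
    return 0;
}
```

        Every call through such a wrapper pays for the copy and the
 check, and the -1 branch is exactly the kind of case where a faithful
 translation is not possible.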

	In the case of Linux, the kernel header files are the
 underpinnings of the architecture independent interface.

	Take a simple general ANSI C include file like <errno.h>. This
 in turn includes /usr/include/errnos.h, which includes
 /usr/include/linux/errno.h, which in turn includes
 /usr/include/asm/errno.h. See? A simple, standard include file like
 <errno.h>, and one needs kernel include files for that.
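
        As a small illustration, the sketch below names only standard
 headers, yet the constant ENOENT it relies on is supplied by
 whatever <errno.h> pulls in underneath. The function name open_error
 is hypothetical. Running "gcc -M" on a file containing it lists
 every header actually read, which should show the chain on a given
 system.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Try to open a file for reading.
 * Returns 0 on success, or the errno value reported by fopen().
 */
int open_error(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return errno;
    fclose(f);
    return 0;
}
```

        A caller can compare the result against ENOENT without ever
 mentioning a kernel header by name.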

		   Traditional two symlink approach
                   =========== === ======= ========          

Under libc5, it was standard for part of the user interface to libc to 
be exported from the kernel includes, via /usr/include/linux and 
/usr/include/asm.  Traditionally, this was done by linking those two 
directories to the appropriate directories in /usr/src/linux/include.  
This is the method documented in the install instructions for the 
kernel sources, even today.

			   Why that is bad
                           === ==== == ===

However, exporting kernel headers to user space no longer makes sense
(in the early days of Linux it did). First, it is getting harder to
synchronise the libc and the kernel headers as in the old days, and
compiling against the latest kernel headers may subtly break new code,
since the headers compiled against differ from those the library was
built with. Secondly, the necessity of not making programs break
was preventing experimentation in the kernel development itself (see
appendix A for details). The solution is to separate out the two sets
of header files: the kernel uses one, and libc uses another, well
known, stable set of headers which it itself has been built with.
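
The following C sketch simulates that kind of subtle breakage, using the
NR_OPEN case discussed in appendix A. It is an illustration only: the
two constants, both struct types, and the function names are
hypothetical, and the "library" is just a function that was written
against the smaller structure.

```c
#include <stddef.h>
#include <string.h>

#define OLD_NR_OPEN 256   /* assumed value the library was built with */
#define NEW_NR_OPEN 1024  /* assumed value in the newer kernel headers */

/* fd set as the library understands it. */
typedef struct {
    unsigned char bits[OLD_NR_OPEN / 8];
} lib_fd_set;

/* fd set as a program compiled against the newer headers sees it. */
typedef struct {
    unsigned char bits[NEW_NR_OPEN / 8];
} prog_fd_set;

/* "Library" routine: clears only what it believes an fd set to be. */
void lib_fd_zero(void *set)
{
    memset(set, 0, sizeof(lib_fd_set));
}

/*
 * "Program" side: fill a set with 0xff, ask the library to clear it,
 * and count the bytes the library never touched.
 */
size_t stale_bytes_after_zero(void)
{
    prog_fd_set set;
    size_t stale = 0;

    memset(&set, 0xff, sizeof set);
    lib_fd_zero(&set);
    for (size_t i = 0; i < sizeof set.bits; i++) {
        if (set.bits[i] != 0)
            stale++;
    }
    return stale;
}
```

Nothing here fails to compile or link; the program simply ends up with
part of its set still holding stale bits, which is why the mismatch is
so hard to track down.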

			Debian's libc5 method
                        ======== ===== ======

        The headers were included in libc5-dev after a rash of very
 buggy alpha kernel releases (1.3.7* or something like that) that
 proceeded to break compilations, etc.  Kernel versions are changed
 far more rapidly than libc is, and there are higher chances that
 people install a custom kernel than they install custom libc.

        Add to that the fact that few programs really need the more
 volatile elements of the header files, that is, things that really
 change from kernel version to kernel version. (Before you reject
 this, consider: programs compiled on one kernel version usually work
 on other kernels.) The few that do need specific kernel headers can
 use -I/usr/src/kernel-headers-2.0.33 or something similar.

        So, it makes sense that a set of headers be provided from a
 known good kernel version, and that is sufficient for compiling most
 programs. (It also makes the compile time environment for programs
 on Debian machines a well known one, easing the process of dealing
 with problem reports.) The few programs that really depend on cutting
 edge kernel data structures may just use -I/usr/src/linux/include
 (provided that kernel-headers or kernel-source exists on the system).
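
        A program built with such a -I flag may want to confirm which
 kernel headers it actually picked up. The sketch below decodes a
 version code using the same packing that KERNEL_VERSION() in
 <linux/version.h> uses. The macro is redefined locally under the
 hypothetical name VERSION_CODE so the fragment compiles without any
 kernel headers present; struct kver and decode_version are likewise
 hypothetical names.

```c
/* Same packing as KERNEL_VERSION(a, b, c) in <linux/version.h>. */
#define VERSION_CODE(a, b, c) (((a) << 16) + ((b) << 8) + (c))

struct kver {
    int major;
    int minor;
    int patch;
};

/* Split a packed version code back into its three components. */
struct kver decode_version(unsigned long code)
{
    struct kver v;

    v.major = (int)(code >> 16);
    v.minor = (int)((code >> 8) & 0xff);
    v.patch = (int)(code & 0xff);
    return v;
}
```

        In a real program one would include <linux/version.h> and pass
 LINUX_VERSION_CODE to decode_version(); directories given with -I are
 searched before the system include directories, so the result
 reflects the header set selected on the command line.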

	It was decided, then, to have the libc development package
 include files from one specific kernel-source version, to insulate
 most systems from the vagaries of bleeding edge kernel sources, and
 to prevent subtle breakage introduced by differences between the
 headers a program is compiled with and the headers the libc was
 compiled with.

        Most programs, even if they include <linux/something.h>, do
 not really depend on the version of the kernel: as long as the kernel
 versions are not too far apart, they will work. The headers provided
 in libc5-dev (and libc6-dev) are exactly such a set.

        libc5-dev is uploaded frequently enough that it never lags too
 far behind the latest released kernel. libc6 has totally disconnected
 the included headers from kernel headers.

        There are two different capabilities which are the issue, and
 the kernel-packages and libc{5,6}-dev address different ones:

 a) The kernel packages try to provide a stable, well behaved kernel
    and modules, and may be upgraded whenever there are significant
    advances in those directions (bug fixes, more/better module
    support, etc).  These, however, may not have include files that
    are non-broken as far as non-kernel programs are concerned, and
    the quality of the development/compilation environment is not the
    kernel packages' priority. (Also, please note that the kernel
    packages are tied together, so kernel-source, headers, and image
    are produced in sync.)

 b) Quality of the development/compilation environment is the priority
    of libc{5,6}-dev package, and it tries to ensure that the headers it
    provides would be stable and not break non-kernel programs. This
    assertion may fail for alpha kernels, which may otherwise be
    perfectly stable, hence the need for a different set of known-good
    kernel include files.

		   Shortcoming of the libc5 method
                   =========== == === ===== ======

	Unfortunately, the proposal for libc5 is beginning to unravel
 at the edges, since Debian is going truly multi-architecture (and there
 are murmurs of multi-OS Debian). 

	The reason I say that the solution is unravelling at the edges
 is that the kernel header files are getting to be quite architecture
 dependent. We still want to make libc6-dev include headers from a
 well known stable kernel source. But there is no one set of headers
 to install, since different architectures need different (sometimes
 radically different) files. 

	If libc development packages continued to include kernel
 headers explicitly, we would need different headers for different
 architectures, resulting in libc6-dev-i386, libc6-dev-m68k, et al.
 Or we could maintain all the files in libc6-dev as a set of (quite
 large) arch dependent patches. But those files are precisely the
 files contained in kernel-headers-2.0.32.

	Even where more than one architecture currently shares a
 common kernel-header file set, this invariant is not guaranteed for
 any future releases. Kernel headers need not be the same for any two
 given architectures. Any solution would do well to address that.

	However, the following still remains true: the libc
 development packages should include a static, stable, tested, known
 good set of kernel headers to insulate developers from the vagaries
 of unstable kernel headers (and the subtler pitfall of compiling with
 headers that do not match the libc being linked with).

			Debian's libc6 method
                        ======== ===== ======

	The current solution is a variant of the original, bad,
 symlink solution. The variation is that we link to header files from
 a specific kernel version, namely 2.0.32, which are the headers that
 libc was compiled with. Whereas the original libc5 symlink pointed at
 any old kernel, the new libc6 symlink points to
 /usr/src/linux-2.0.32, itself a symbolic link that leads to headers
 from a *static, well known*, *supported*, tested kernel version, and
 we let the kernel-headers package handle architecture dependencies
 (which it had to anyway).

	One should also consider that the libc maintainer has enough
 on their plate, and should not be expected to be responsible for the
 creation and testing of the arch dependent diffs.

	Add to that the fact that the afore-mentioned diffs are
 exactly what the kernel-headers package contains: it exists already,
 and requires no work on the part of the libc maintainer.

        So, a dependency was created on kernel headers rather than
 creating kernel-header-sized architecture dependent diffs. The
 dependency is such that libc6-dev links to the directory
 /usr/src/linux-2.0.32. This link is provided by just the packages
 kernel-source-2.0.32 or kernel-headers-2.0.32, and *no other* kernel
 package! So, libc6-dev can now provide 2.0.32 kernel headers
 *for all architectures*, with no messy architecture dependent
 patches; we always have fixed, static, known good headers in
 /usr/include/{linux,asm}, and this is goodness.
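
        To check where a path such as /usr/include/linux really ends
 up on a given machine, a few lines of C around the POSIX realpath()
 call are enough. This is only a sketch: the function name
 resolve_headers_dir is hypothetical, and passing NULL as the second
 argument to realpath() assumes a POSIX.1-2008 system.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>

/*
 * Follow every symbolic link in path and copy the final location
 * into out. Returns 0 on success, -1 if the path cannot be resolved
 * or the result does not fit in outlen bytes.
 */
int resolve_headers_dir(const char *path, char *out, size_t outlen)
{
    char *resolved = realpath(path, NULL); /* realpath allocates */
    int n;

    if (resolved == NULL)
        return -1;
    n = snprintf(out, outlen, "%s", resolved);
    free(resolved);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

        Called with "/usr/include/linux" on a system set up as
 described above, the output buffer would be expected to name a
 directory under the 2.0.32 headers rather than whatever kernel source
 happens to be unpacked in /usr/src/linux.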

        This is a working technical solution to having the libc
 development package contain/depend on a well known static set of
 kernel headers (insulating the vast majority of programs that are not
 closely tied to kernel version specific internal data structures),
 while allowing the kernel headers to vary from architecture to
 architecture, and still allowing device driver authors to have
 any set of kernel headers they want on the machine through the simple
 artifice of adding a -I flag to the compilation flags.

	Indeed, had we thought of the current solution in libc5
 days it would have been cleaner (and produced less confusion now).
 The solution of having libc5-dev depend on kernel-headers-X.XX.XX
 would have worked just as well in the libc5 case as it does in the
 libc6 case.  I don't see a change in the situation, just a switch to
 an equivalent, possibly cleaner, solution.  If we had had libc5-dev
 depend on kernel-headers, then there wouldn't be the argument that
 there is now.  

  Author's response to complaints that Debian is doing its own thing
  ======== ======== == ========== ==== ====== == ===== === === =====

       Yes, Debian is different. It has a package management
 system. Different parts of Debian work with each other. There are
 assumptions that parts of the distribution make about each other, and
 because of this set of co-operating assumptions, or rules, or policy,
 Debian is better integrated than most Linux distributions I have seen. 

        We often do not do things the "standard" way. Like when we
 started including /usr/include/{linux,asm} directories instead of
 having symlinks. It is different, and it was acknowledged to be
 technically superior. 

        One of the set of rules we follow is the File system
 standard. It helps us make things like, say, the menu system, or the
 web servers, more effective.

        Again, we are making changes that shall not be reflected in
 other distributions. I think we are making the technically superior
 choice. But we shall be different again.

        We have a mechanism in place that allows package management
 while keeping in mind dependencies and such: I suggest we use it. 

       I think this solution works, and is an elegant solution.


			      Appendix A
                              ======== =

        This document contains comments from Linus Torvalds (made in
 an ``off-the-cuff'' personal email) to help clarify the rationale
 behind the Debian way of handling symlinks, but this should not be
 seen as an official policy statement by Linus. I'm attaching a
 disclaimer in his own words.

        The only reason that Linus's message is quoted in here is that
 he can explain the technical reasons with far more lucidity than I
 can, and now that I have permission to include his mail, I am
 removing most of my far less facile efforts in that regard. 

	Need for isolating the C development Library from volatile
	kernel headers

>> "David" == David Engel <david@sw.ods.com> said on Mon, 24 Feb 1997
>> "Linus" == Linus Torvalds said on Mon, 24 Feb 1997

David> Hi Linus,
David> No matter how well we try to explain ourselves, the symlinks issue
David> keeps coming up.  Would you mind if we used your message below in
David> our responses?

Linus> Sure. Don't make it "the word of God" - please point out that
Linus> it was a off-the-bat personal reply to a question concerning
Linus> this, and while I'm more than happy to have the email
Linus> circulated it shouldn't be seen as a "official" document in any
Linus> way..
Linus> Linus

        The headers were included in libc5-dev after a rash of very
 buggy alpha kernel releases (1.3.7* or something like that) that
 proceeded to break compilations, etc.  Kernel versions are changed
 far more rapidly than libc is, and there are higher chances that
 people install a custom kernel than they install custom libc.

        libc6 includes its own version of /usr/include/linux and
 friends from the beginning (that is, this is no longer a Debian only
 feature, the upstream version has moved to this scheme as well).

>> "Linus" == Linus Torvalds said on Wed, 22 Jan 1997:

Linus> The kernel headers used to make sense exporting to user space,
Linus> but the user space thing has grown so much that it's really not
Linus> practical any more. The problem with Debian is just that they
Linus> are different, not that they are doing anything wrong. That
Linus> leads to differences between the distributions, and that in
Linus> turn obviously can result in subtle problems.

Linus> As of glibc, the kernel headers will really be _kernel_
Linus> headers, and user level includes are user level
Linus> includes. Matthias Ulrich did that partly because I've asked
Linus> him to, but mainly just because it is no longer possible to try
Linus> to synchronize the libc and the kernel the way it used to
Linus> be. The symlinks have been a bad idea for at least a year now,
Linus> and the problem is just how to get rid of them
Linus> gracefully. Personally, I'm counting on glibc, which we are
Linus> already using on alpha.

Linus> Just to give you some idea of exactly why the includes really
Linus> can't be handled by simple symlinks: the main problem is
Linus> version skew. Lots of people want to upgrade their library
Linus> without affecting the kernel, and probably even more people
Linus> want to be able to upgrade their kernel without affecting their
Linus> compilation environment. Right now doing that has been
Linus> extremely fragile.

Linus> Just to give _one_ example of why the symlinks are bad: NR_OPEN
Linus> and "fd_set". I have had no end of problems making NR_OPEN
Linus> larger in the kernel, exactly _because_ of the damn
Linus> sym-links. If I just make NR_OPEN larger (the right thing to
Linus> do), the problem is that people with old libraries will now
Linus> compile against a header file that doesn't match the library
Linus> any more. And when the library internally uses another NR_OPEN
Linus> than the new program does, "interesting" things happen.

Linus> In contrast, with separate header files, this doesn't make any
Linus> difference.  If I change NR_OPEN in the kernel, the compilation
Linus> environment won't notice UNTIL the library and associated
Linus> header files are changed: thus the user will continue to compile
Linus> with the old values, but because we'll still be binary
Linus> compatible, the worst thing that happens is that new programs
Linus> won't take advantage of new features unless the developer has
Linus> upgraded his library. Compare that to breaking subtly.

Linus> NR_OPEN is just _one_ example, and actually it's one of the
Linus> easier ones to handle (because the only thing that really makes
Linus> much of a difference when it comes to NR_OPEN is the fd_set
Linus> size - but it certainly bit some people). Another major problem
Linus> is name-space pollution: the POSIX/ANSI/XOpen rules are not
Linus> only complex, but they are actually contradictory too. And the
Linus> kernel header files really can't reasonably support all of the
Linus> intricacies very cleanly.

Linus> One specific example of why we want separate header files for
Linus> libraries and kernel is offered by glibc: Matthias wanted to
Linus> have a "sigset_t" that will suffice for the future when the
Linus> POSIX.1b realtime signals are implemented. But at the same time
Linus> he obviously wants to be able to support programming on
Linus> Linux-2.0 and the current 2.1 that do not have that support.

Linus> The _only_ reasonably clean way to handle these kinds of
Linus> problems is to have separate header files: user programs see a
Linus> larger sigset_t, and then the library interaction with the
Linus> kernel doesn't necessarily use all of the bits, for
Linus> example. Then later, when the kernel support is actually there,
Linus> it's just a matter of getting a new shared library, and voila,
Linus> all the realtime signals work.

Linus> The symlink approach simply wouldn't work for the above: that
Linus> would have required everybody who uses the library to have a
Linus> recent enough kernel that whatever magic all the above entails
Linus> would be available in the kernel header files. But not only
Linus> don't I want to pollute the kernel header files with user level
Linus> decisions, it's actually possible that somebody wants to run
Linus> glibc on a 1.2.x kernel, for example. We _definitely_ do not
Linus> want him to get a 32-bit sigset_t just because he is happy with
Linus> an old kernel.

Linus> Anyway, this email got longer than intended, but I just wanted
Linus> to make clear that the symlinks will eventually be going away
Linus> even in non-Debian distributions. Debian just happened to do it
Linus> first - probably because Debian seems to be more interested in
Linus> technical reasons than any old traditions. And technically, the
Linus> symlinks really aren't very good.

Linus> The _only_ reason for the symlinks is to immediately give
Linus> access to new features in the kernel when those happen. New
Linus> ioctl numbers etc etc. That was an overriding concern early on:
Linus> the kernel interfaces expanded so rapidly even in "normal"
Linus> areas that having the synchronization that symlinks offered was
Linus> a good thing.

Linus> However, the kernel interfaces aren't really supposed to change
Linus> all that quickly any more, and more importantly: the technical
Linus> users know how to fix things any way they want, so if they want
Linus> a new ioctl number to show up they can actually edit the header
Linus> files themselves, for example. But having separation is good
Linus> for the non-technical user, because there are less surprises
Linus> and package dependencies.

Linus> Anyway, something like the patch that David suggested will
Linus> certainly go in, although I suspect I'll wait for it to become
Linus> "standard" and the glibc first real release to take place.

 Perfect of virtue, always acting with recollection, and liberated by
 final realisation - Mara does not know the path such people
 travel. 57
Manoj Srivastava  <srivasta@acm.org> <http://www.datasync.com/%7Esrivasta/>
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05  CC 2D 27 12 1D F5 E8 6E
