
Re: Adoption of Nix?



  Hi there.  As I'm sure everyone knows, I'm not exactly unbiased here
since I've done a lot of work on the apt system (although nix looks more
like a replacement for dpkg).

  This is the same package manager that was posted on lambda-the-ultimate
a while back, right?  Since you didn't provide a link, I'll provide one.
According to Google, we're talking about this:

    http://nixos.org

  My first impression from reading blurbs on news sites was that they
either found some seriously deep magic, or they're ignoring a lot of
the practical issues that are involved in managing packages within a
large Linux distribution -- and I suspect the latter rather than the
former.  As an academic research project, they are within their rights
to do that, and for some use cases it doesn't matter, but it doesn't
mean we should adopt their software in Debian.

  To take a few excerpts from their site:

====
Nix stores packages in the Nix store, usually the directory /nix/store,
where each package has its own unique subdirectory such as

/nix/store/r8vvq9kq18pz08v249h8my6r9vs7s0n3-firefox-2.0.0.1/
====

  Never mind that this breaks the FHS -- I'll pretend for now that
we've amended policy to allow this, or that we've stuck it in /var
with some horrible bind-mounting to make it appear in the right places.

  That's a terrible user interface decision!  This is Unix, and
filenames are part of the user interface.  That file name, aside from
breaking all user expectations (as per my note about the FHS), is
completely unmemorable, means that packages with the same name aren't
necessarily sorted together in directory listings, breaks tab-completion,
and includes a long string of (to the user) meaningless gobbledygook.
At the very *least*, you should put the package name first, to fix the
tab-completion and sorting problems:

/nix/store/firefox-2.0.0.1-r8vvq9kq18pz08v249h8my6r9vs7s0n3

  but then what if I have two firefox-2.0.0.1s installed?  How do I
know which one is which?
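  To make the sorting and tab-completion point concrete, here is a small
sketch.  Only the first hash comes from the excerpt above; the other two
hashes and the thunderbird and firefox-2.0.0.2 entries are made up for
illustration, and prefix matching is only a rough stand-in for what a
shell's completion does.

```python
# Store entries in the hash-first layout. Only the first hash appears in the
# quoted excerpt; the other two entries are hypothetical.
HASH_FIRST = [
    "r8vvq9kq18pz08v249h8my6r9vs7s0n3-firefox-2.0.0.1",
    "s0000000000000000000000000000000-thunderbird-2.0.0.0",
    "z0000000000000000000000000000000-firefox-2.0.0.2",
]


def name_first(entry):
    """Move the hash from the front of an entry to the end."""
    digest, _, name = entry.partition("-")
    return f"{name}-{digest}"


def complete(prefix, entries):
    """Rough stand-in for tab completion: entries starting with the prefix."""
    return sorted(e for e in entries if e.startswith(prefix))


NAME_FIRST = [name_first(e) for e in HASH_FIRST]

# Hash-first: the two firefox entries are separated in a sorted listing,
# and typing "firefox" completes to nothing.
print(sorted(HASH_FIRST))
print(complete("firefox", HASH_FIRST))

# Name-first: both firefox entries sort together and complete from the name.
print(sorted(NAME_FIRST))
print(complete("firefox", NAME_FIRST))
```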

  I hope nix at least has stow-like abilities to create a unified /bin
directory, but that doesn't help when you want to track down the files
of a program for whatever reason.

====
Multiple versions

You can have multiple versions or variants of a package installed at the
same time. This is especially important when different applications have
dependencies on different versions of the same package — it prevents the
“DLL hell”. Because of the hashing scheme, different versions of a package
end up in different paths in the Nix store, so they don’t interfere with
each other.
====

  That's fine as long as your hard drives (never mind the flash devices
embedded systems use, where dpkg is already painfully heavy) are infinitely
large, or you don't install very many versions of very many packages.  The
thought of doing this to track "unstable" terrifies me; I suspect that even
a large hard drive would fill up in a few weeks, months tops.  And you can't
automatically purge versions, because you never know which ones a user
wants.

  Presumably there's a way to turn this off.  In fact, I would expect
that we *would* turn this off by default, with a manual override for
particular packages, if we used it in Debian, because I can't see it
being usable for a whole distribution otherwise.  On the other hand:

====
An important consequence is that operations like upgrading or uninstalling
an application cannot break other applications, since these operations
never “destructively” update or delete files that are used by other
packages.
====

  That sounds like they haven't thought hard about the problems around
upgrades and removals, which are not trivial.  (There's a research team
at the University of Paris they could talk to about this, if they
haven't already.)  Because of that, I suspect that we *can't* disable
the "install multiple versions" feature -- it sounds like the package
manager fundamentally relies on this to do anything.

  In addition to my earlier comments: what if I have multiple Web
servers or database servers installed (or multiple versions of one of
them)?  Which one runs at startup, and what if I have packages that
specifically depend on another one?

====
Complete dependencies
====

  As other people have written, their claims are at best overblown.
e.g., while it can tell that I use Python, there's no possible way
it can tell which versions of Python I'm compatible with.  It also
sounds like maybe they bind programs to the exact binary of the library
they're built with, which would mean that you have to rebuild all the
reverse dependencies of a library every time you rebuild the library!
(that's so outrageous that I'm sure I must just be incorrectly
extrapolating from the summary on their Web site)
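  As a rough sketch of what "rebuild all the reverse dependencies" would
cost if programs really were bound to exact library builds: the dependency
graph below is entirely hypothetical, and nothing here is taken from how
Nix actually computes rebuilds.

```python
from collections import deque

# Hypothetical dependency graph: package -> packages it is built against.
DEPENDS = {
    "libc": [],
    "zlib": ["libc"],
    "libpng": ["zlib", "libc"],
    "python": ["zlib", "libc"],
    "gimp": ["libpng", "python"],
    "hello": ["libc"],
}


def rebuild_set(changed, depends):
    """Return every package that transitively depends on `changed`."""
    # Invert the graph so we can walk from a library to its users.
    reverse = {pkg: set() for pkg in depends}
    for pkg, deps in depends.items():
        for dep in deps:
            reverse[dep].add(pkg)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Rebuilding a leaf application forces nothing else to be rebuilt;
# rebuilding libc forces a rebuild of everything in this toy graph.
print(sorted(rebuild_set("gimp", DEPENDS)))
print(sorted(rebuild_set("zlib", DEPENDS)))
print(sorted(rebuild_set("libc", DEPENDS)))
```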

  Also: what about programs that refer to a file, but can function
without it?  This seems to have the same problem as other harnesses
that "automatically" detect dependencies through file access, in that
it will see a program probing for some functionality and assume that
it requires that functionality.
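  A minimal sketch of that failure mode, assuming a harness that simply
records every path a program tries to open.  The path and the
RecordingOpener class are hypothetical, and this is not a description of
how Nix detects dependencies.

```python
import io

OPTIONAL_THEME = "/usr/share/hypothetical-app/theme.conf"  # hypothetical path


class RecordingOpener:
    """Stand-in for a tracing harness: records every path a program probes."""

    def __init__(self, files):
        self.files = files  # path -> contents, for files that "exist"
        self.probed = []

    def __call__(self, path):
        self.probed.append(path)
        if path not in self.files:
            raise FileNotFoundError(path)
        return io.StringIO(self.files[path])


def load_theme(open_file):
    """Use the optional theme file if present, otherwise fall back."""
    try:
        with open_file(OPTIONAL_THEME) as handle:
            return handle.read().strip()
    except FileNotFoundError:
        return "default"


opener = RecordingOpener(files={})
print(load_theme(opener))  # the program works fine without the file
print(opener.probed)       # yet the harness saw it probe for the file
```

  A harness that treats everything in `probed` as a dependency would mark
the theme file as required, even though the program runs without it.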

  In short: they have a clever way of detecting dependencies that has
a different combination of advantages and disadvantages compared
to our own system.  I can't say which one is better without reading up
more thoroughly on what they do, but unless they have pure magic (not
to mention some heavy-duty static program analysis from machine code)
there's no way they can eliminate the need to manually add or tweak
dependencies.  Our dependency detection works in most of the cases where
you could reasonably expect it to, and it doesn't seem to be burdensome
in practice for package maintainers to add the missing dependencies.

  I can't remember the last time I saw a stable Debian package with
incomplete dependencies, and even in unstable it seems very rare these
days.

====
Garbage collection

When you uninstall a package like this…

$ nix-env --uninstall firefox

the package isn’t deleted from the system right away (after all, you might
want to do a rollback, or it might be in the profiles of other users).
====

  That's great, as long as they include the ability for me to say "I
really want to remove firefox, please show me all the packages that are
keeping it on my system."  Of course, we already have that in Debian.

====
Multi-user support

Starting at version 0.11, Nix has multi-user support. This means that
non-privileged users can securely install software. Each user can have a
different profile, a set of packages in the Nix store that appear in the
user’s PATH. If a user installs a package that another user has already
installed previously, the package won’t be built or downloaded a second
time. At the same time, it is not possible for one user to inject a Trojan
horse into a package that might be used by another user.
====

  I like this, and I've thought from time to time about how we could
theoretically integrate it into dpkg.  I'm a little worried about the
interaction with garbage collection: it sounds to me like a normal user
can DoS the system by depending on someone else's package so that it
never gets garbage collected.

====
Atomic upgrades and rollbacks

Since package management operations never overwrite packages in the Nix
store but just add new versions in different paths, they are atomic. So
during a package upgrade, there is no time window in which the package has
some files from the old version and some files from the new version — which
would be bad because a program might well crash if it’s started during that
period.
====

  I have to say: this is a very nice feature, and the one thing I see
that would be worth thinking about overhauling our package system to
get.
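  For reference, the file-level half of this can be sketched with a
symlink that is replaced by a single rename.  This is an illustration
under my own assumptions (a POSIX system, hypothetical directory names),
not a description of Nix's implementation, and it covers only the switch
between already-installed trees.

```python
import os
import tempfile


def switch_profile(link, new_target):
    """Point `link` at `new_target` in one atomic step (POSIX rename)."""
    tmp = f"{link}.tmp-{os.getpid()}"
    os.symlink(new_target, tmp)
    # rename() replaces the old symlink atomically: readers see either the
    # old target or the new one, never a half-updated state.
    os.replace(tmp, link)


def demo():
    """Upgrade from app-1.0 to app-2.0, then roll back; return what was seen."""
    history = []
    with tempfile.TemporaryDirectory() as root:
        link = os.path.join(root, "current")
        for version in ("app-1.0", "app-2.0", "app-1.0"):
            target = os.path.join(root, version)
            os.makedirs(target, exist_ok=True)
            switch_profile(link, target)
            history.append(os.path.basename(os.readlink(link)))
    return history


print(demo())
```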

  However, the language above is somewhat misleading, on two points.
First, I think it overstates the problems that can occur due to a
partially installed upgrade.  That's certainly a theoretical
possibility, but dpkg goes to a lot of trouble to minimize the time
window during which an inconsistent package is visible.  In practice,
I've never seen this happen.  What I *have* seen is packages failing
between being installed and being configured, which gets me to my
second point:

  Rolling back software upgrades isn't as easy as just reverting the
files.  Many packages need some amount of post-install configuration,
and this can't be done in an acceptable way without breaking atomic
upgrades and rollbacks.  Typical examples include:

   1. Upgrading configuration files or databases to new formats.
   2. Creating users in the system password file.
   3. Starting or stopping daemons.
   4. Running programs to initialize system-wide caches of data
      so they are available when the program runs.

  Take (1), for instance.  Where are the package's configuration files
and databases stored?  I'll take databases as an example, but most of
these comments (except the size-related ones) apply to configuration
files as well.  If they're installed inside the package, then

  (a) Different versions of the package will have different databases,
      even if they're compatible!  This might be a feature, but then
      again, it might be a bug (what if I have two programs configured
      to use different builds of a library that has a systemwide
      database -- one of them can put stuff into the database and it
      won't be visible to the other one).
  (b) Rolling back to a previous version of the package will *destroy*
      any data you've added since the upgrade (obviously unacceptable).
  (c) You'll need a way to copy the database on upgrade, or upgrades
      will lose all the user's data (again, obviously unacceptable).
      The upgrade process will also have to upgrade the database format
      to the new version.  You also have to use disk space to store
      multiple copies of the database (which might be substantial).

  If the databases are installed systemwide, on the other hand, then
you cannot perform the upgrade atomically when the database format
changes: at some point, either the new version of the program will be
installed and the systemwide database will have the old version, or the
old version will be installed and the systemwide database will have the
new version.  You might say that this isn't really a problem, in which
case I will point out that the same applies to how dpkg does things. :-)

  The only way to deal with (b) is to take the data from the new version
and somehow inject it into the old version, but this raises its own
problems.

  (i)   If the user has relied on the "multiple package versions"
        feature of nix and has different configurations and/or database
        contents in the different package versions, you've just wiped out
        or otherwise messed with the configuration / data of the old
        version.
  (ii)  If the upgrade required a database or configuration file format
        upgrade, then you need to "downgrade" the contents of the
        database or configuration file.  But this isn't possible in
        general: first, it is often not possible *in principle* because
        the new format contains features that cannot be expressed in the
        old format.  Even if that's not an issue, upstream authors will
        often provide upgrade scripts but not downgrade scripts, or
        they'll only support downgrades to particular versions of the
        program.

        Also, who performs the auto-downgrade?  A package script has to
        do it (there's no way the package manager can know about the
        specific procedure for downgrading every configuration file and
        database on the system), and that means that the package install
        script has to contain information about how to downgrade to any
        version that the user could possibly have on his or her system.
        In practice, you would end up supporting downgrades to only
        a few recent versions.
  (iii) Unlike upgrade scripts, downgrade scripts are likely to be
        poorly tested and unreliable.  For that matter, *upgrade*
        scripts from more than a few versions back are often not well
        tested; that's one of the reasons we recommend that people
        upgrade only between consecutive Debian releases (there just
        aren't enough people looking at what happens when you skip
        releases, especially more than one).
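  Points (ii) and (iii) can be sketched with a toy migration table.  The
schema versions, field names, and migration steps are all hypothetical;
the point is only that a downgrade step may be lossy or missing
altogether.

```python
# Hypothetical upgrade steps, keyed by the version they start from.
UPGRADES = {
    1: lambda db: {**db, "version": 2, "tags": []},
    2: lambda db: {
        **db,
        "version": 3,
        "tags": [{"name": t, "color": None} for t in db["tags"]],
    },
}
# Hypothetical downgrade steps. Going 3 -> 2 drops tag colors, because the
# old format cannot express them; nobody ever wrote a 2 -> 1 step.
DOWNGRADES = {
    3: lambda db: {**db, "version": 2, "tags": [t["name"] for t in db["tags"]]},
}


class NoMigrationPath(Exception):
    """Raised when no script exists for a required migration step."""


def migrate(db, target):
    """Move `db` one version at a time until it reaches `target`."""
    while db["version"] != target:
        current = db["version"]
        table = UPGRADES if current < target else DOWNGRADES
        step = table.get(current)
        if step is None:
            raise NoMigrationPath(f"no script to leave version {current}")
        db = step(db)
    return db


v3 = migrate({"version": 1, "items": ["a"]}, 3)
v3 = {**v3, "tags": [{"name": "urgent", "color": "red"}]}  # data added after upgrade

v2 = migrate(v3, 2)
print(v2["tags"])              # the color is gone after the rollback
print(migrate(v2, 3)["tags"])  # and upgrading again does not bring it back
```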

  I won't bore you with 2-4, and anyway this is the most intractable
of the problems.

  Please understand, *this* is the reason that dpkg and apt do not
officially support downgrades.  Replacing files with earlier ones is
trivial, and dpkg is happy to do that all day for you.  We already store
snapshots of old .debs online, and we could easily arrange for apt to
download them and/or to archive every version you install for future
rollback.  The problem is that it's very difficult to reliably downgrade
packages *in a manner that automatically results in them working*.

====
Functional package language

Packages are built from Nix expressions, which is a simple functional
language. A Nix expression describes everything that goes into a package
build action (a “derivation”): other packages, sources, the build script,
environment variables for the build script, etc. Nix tries very hard to
ensure that Nix expressions are deterministic: building a Nix expression
twice should yield the same result.
====

  Ultimately, in order to build a program, you have to invoke external
processes (unless Nix includes a C compiler), so you're only
deterministic in the sense of "except the changes introduced by running
random stuff as part of the build process", which is to say not entirely.
Pedantry aside, I think this might still be a useful piece of technology
(just going by the above description), but I don't see a really
compelling advantage over the build system we use now.  Also, the only
place where we currently have nondeclarative stuff in the Debian build
process is the shell snippets in debian/rules; everything else is
declarative.  (some of our build systems go further than that)

  I think the confusion of trying to introduce a new build system would
outweigh any marginal benefit from the improvements they've made, but
again I haven't read the details, just the summary.  Of course, you
could always introduce this the same way that dbs and the various other
flavor-of-the-year build systems have been introduced; we have enough
confusion there already, why not increase it a little more?

====
Transparent source/binary deployment

(snip description of gentoo-like abilities)
====

  I think that's an interesting idea, but I suspect it would be less
work to build this capability into apt than to completely replace our
entire packaging infrastructure.

====
Binary patching
====

  Same comment as above: nice feature, but it's easier to extend the
existing infrastructure.

====
Nix Packages collection

We provide a large set of Nix expressions containing hundreds of existing
Unix packages, the Nix Packages collection (Nixpkgs).
====

  We provide a large set of Debian source packages containing tens of
thousands of existing Unix packages, the Debian package archive. (SCNR)



  With all that said, I think that Nix is a useful program, both because
it explores some new ideas in package management, and also because I can
see it being very useful in some niche areas where current package
managers are lacking.  User-level package management is a great thing in
some environments and this seems like a decent solution to that problem
(the hard part is getting the packages to dynamically adapt to their
install directory, but the way they have things set up, you have to do
that anyway).  It also looks like a great way for ISVs to distribute
their software: it's distribution-independent, and it can apparently
guarantee that you bind against exactly the libraries you QAed against.
It won't help with the kernel, X server, etc., but it avoids most of the
versioning issues that can bite commercial software.

  In general, it looks like a nice way of handling binary packages that
aren't part of the core distribution, without the disgusting hacks you
see in things like autopackage.

  For Debian-wide use, though, I don't think that Nix is worth the
trouble.

  Daniel

