Dependencies on shared libs, take 2
[Bcc on firstname.lastname@example.org so that discussion happens on -devel]
I've gone forward with the plan that I exposed in
Please grab the code with:
$ bzr get http://bzr.debian.org/private/hertzog/shlibdeps/
The repository contains two scripts: a new "dpkg-gensymbols" that is used
to generate the "DEBIAN/symbols" file during the build process, and a
replacement for "dpkg-shlibdeps" that uses symbols files to generate
dependencies and falls back to normal shlibs files if there's no
corresponding symbols file.
If you want to try it out, you can install it:
$ sudo make install
(It will crudely copy files into the system, but "make uninstall" will
remove them and bring the system back to its previous state.)
How does it work to generate a dependency
dpkg-shlibdeps works as usual except that instead of looking only at
*.shlibs files, it first tries to find *.symbols files. Then, for each ELF
binary, it generates the list of dynamic symbols that the binary uses. For
each symbol, it goes through the list of libraries (in the same order as
they are referenced in the binary) and tries to find the symbol in the
corresponding symbols file. If the symbol is found, it looks up the minimal
version of the library which provides it and, if necessary, raises the
minimal version needed by the whole package. If the symbol is not found in
any *.symbols file, it checks the libraries which are listed by the binary
but for which no *.symbols file was found. If the symbol is present in one
of those libraries, it records the dependency indicated by the
corresponding shlibs file. If the symbol is found nowhere, it displays a
warning (maybe it should fail?).
At the end, it computes the resulting dependency.
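To make the lookup order easier to follow, here is a rough Python sketch of
the logic described above. It is not the code from the repository: the
function names, the in-memory data layout and the naive version ordering
(purely numeric dotted versions, not real Debian version comparison) are all
assumptions made for illustration.

```python
def version_key(version):
    """Naive ordering for numeric dotted versions such as "2.3.6".
    ASSUMPTION: real Debian version comparison is more involved."""
    return tuple(int(part) for part in version.split("."))


def compute_dependencies(used_symbols, libraries, symbols_files,
                         shlibs_deps, lib_exports):
    """Hypothetical model of the lookup, working on plain dicts.

    used_symbols:  dynamic symbols needed by the binary
    libraries:     library names, in the order the binary references them
    symbols_files: {library: (package, {symbol: minimal_version})}
    shlibs_deps:   {library: dependency string taken from its shlibs file}
    lib_exports:   {library: set of exported symbols}, used only for
                   libraries that have no symbols file
    """
    minimal = {}      # package -> highest minimal version seen so far
    fallback = []     # dependencies taken from shlibs files
    unresolved = []   # symbols found nowhere (these trigger a warning)
    for symbol in used_symbols:
        found = False
        # First pass: libraries that have a symbols file, in link order.
        for lib in libraries:
            if lib not in symbols_files:
                continue
            package, table = symbols_files[lib]
            if symbol in table:
                version = table[symbol]
                current = minimal.get(package)
                if current is None or version_key(version) > version_key(current):
                    minimal[package] = version
                found = True
                break
        if found:
            continue
        # Second pass: libraries for which only a shlibs file is known.
        for lib in libraries:
            if lib in symbols_files:
                continue
            if symbol in lib_exports.get(lib, set()):
                dep = shlibs_deps[lib]
                if dep not in fallback:
                    fallback.append(dep)
                found = True
                break
        if not found:
            unresolved.append(symbol)
    return minimal, fallback, unresolved


def format_depends(minimal, fallback):
    """Join both kinds of dependencies into one Depends-style string."""
    parts = ["%s (>= %s)" % (pkg, ver) for pkg, ver in sorted(minimal.items())]
    return ", ".join(parts + fallback)
```

With a symbols file for libc6 only, a binary using one symbol introduced in
2.3.6 and one introduced in 2.5 would end up with "libc6 (>= 2.5)", while any
library without a symbols file keeps the dependency from its shlibs file.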
dpkg-shlibdeps will use symbols files available in /etc/dpkg/symbols/. So
if you want to try it out without recompiling many packages, you can
simply generate the symbols files that you want and put them in this
directory.
Checking what it generates is easy enough:
$ dpkg-shlibdeps -e/bin/ls -O
shlibs:Depends=libselinux1 (>= 2.0.15), libc6 (>= 2.3.6.ds1-13), libacl1 (>= 2.2.11-1)
In this sample I have only installed a symbols file for libc6.
What does it mean for library maintainers
Library maintainers are supposed to maintain the *.symbols file. For
this, they have to create files named "debian/<package>.symbols.<arch>"
(dpkg-gensymbols will also try to fall back to "debian/symbols.<arch>",
"debian/<package>.symbols" and "debian/symbols"). They are
required to provide the minimal version (as used in the generated
dependency) associated with each symbol.
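The fallback order for the maintainer-supplied file can be pictured with a
small Python sketch. The file names are the ones listed above; the function
names and the "root" parameter are hypothetical and only serve the example.

```python
import os


def candidate_symbols_files(package, arch):
    """Candidate file names, in the order listed above (most specific first)."""
    return [
        "debian/%s.symbols.%s" % (package, arch),
        "debian/symbols.%s" % arch,
        "debian/%s.symbols" % package,
        "debian/symbols",
    ]


def find_symbols_file(package, arch, root="."):
    """Return the first candidate that exists under root, or None."""
    for name in candidate_symbols_files(package, arch):
        path = os.path.join(root, name)
        if os.path.isfile(path):
            return path
    return None
```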
Then during the build process, dpkg-gensymbols will use those symbols files
and merge in information about newer symbols provided by the library.
The result is shipped inside the package itself as a DEBIAN/symbols file.
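As an illustration of the merge step, here is a minimal Python sketch. It
assumes the symbols information can be modelled as a plain mapping from
symbol name to minimal version; the function name is hypothetical, and what
the sketch does with symbols that disappeared (it keeps them and reports
them) is an assumption, not a description of the real script.

```python
def merge_symbols(maintainer_symbols, exported_symbols, current_version):
    """Merge maintainer-supplied data with what the built library exports.

    maintainer_symbols: {symbol: minimal_version} from the maintainer's file
    exported_symbols:   symbols found in the freshly built library
    current_version:    version of the package being built

    Returns (merged, new, missing).
    """
    exported = set(exported_symbols)
    merged = dict(maintainer_symbols)
    new = []
    for symbol in sorted(exported):
        if symbol not in merged:
            # A symbol the maintainer did not list yet first appears in
            # the version that is currently being built.
            merged[symbol] = current_version
            new.append(symbol)
    # ASSUMPTION: vanished symbols are only reported, not dropped.
    missing = sorted(s for s in maintainer_symbols if s not in exported)
    return merged, new, missing
```

Symbols already known keep their old minimal version, so running the merge
against an older package first and a newer one afterwards builds up a
history of when each symbol appeared.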
The canonical way to call dpkg-gensymbols during a build is:
dpkg-gensymbols -p<package> -P<packagebuildtree>
(the version is extracted from the changelog, and all the libraries found
in the <packagebuildtree> are scanned)
If you want to explicitly list the libraries that will be scanned, then
you can pass several -e<library-file> (you can use glob expression like
Library maintainers who want to avoid any mistakes can use the "-c" option
(for compare), which will make the build fail if the generated
symbols file differs from the maintainer-supplied file. In that case, the
build log contains a diff between the two symbols files so that the
maintainer can analyze the differences (and update the file if necessary).
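The idea behind a compare mode like this can be sketched in a few lines of
Python with the standard difflib module. This is only an illustration of
"diff the two files and fail if they differ"; the function names are
hypothetical and nothing here assumes a particular symbols file format.

```python
import difflib


def compare_symbols(maintainer_text, generated_text):
    """Return a unified diff as a list of lines; empty if contents match."""
    return list(difflib.unified_diff(
        maintainer_text.splitlines(),
        generated_text.splitlines(),
        fromfile="maintainer",
        tofile="generated",
        lineterm="",
    ))


def check_or_fail(maintainer_text, generated_text):
    """Print the diff and abort with a non-zero status if the files differ."""
    diff = compare_symbols(maintainer_text, generated_text)
    if diff:
        print("\n".join(diff))
        raise SystemExit(1)
```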
Creating a first version of the symbols file is not difficult either. For
the sake of example, here's how I did it for the libc6 package. I included
the etch package first so that I have history of symbols starting from
$ aptitude download libc6/stable libc6/unstable
$ dpkg -x libc6_2.3.6.ds1-13_i386.deb /tmp/etch-libc6
$ dpkg -x libc6_2.5-9_i386.deb /tmp/sid-libc6
$ dpkg-gensymbols -v2.3.6.ds1 -plibc6 -e/tmp/etch-libc6/lib*.so* -Olibc6.symbols
$ dpkg-gensymbols -v2.5-9 -plibc6 -e/tmp/sid-libc6/lib*.so* -Olibc6.symbols
Note that -P/tmp/etch-libc6 should have been enough, but since the etch
libc6 package contains multiple versions of the same shared libraries,
I had to specify precisely which files I wanted to scan with -e.
Note also that you should do that for all architectures in case symbol
information differs from one arch to another. Since this is painful, I'll
try to generate files ready to be downloaded (see below). If you know that
there's no difference between architectures, you don't need to bother, but
using -c during the build will help you ensure that you were right and that
there's indeed no difference.
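If you have collected symbol information for several architectures, a quick
way to see whether per-arch files are needed at all is to group the
architectures whose data is identical. The Python sketch below assumes the
information is available as a mapping from symbol to minimal version for
each architecture; the function name and that layout are hypothetical.

```python
def group_identical_architectures(per_arch_symbols):
    """Group architectures that share exactly the same symbol information.

    per_arch_symbols: {arch: {symbol: minimal_version}}
    Returns a sorted list of sorted architecture lists.
    """
    groups = {}
    for arch, table in per_arch_symbols.items():
        # Two architectures belong together only if every symbol and
        # every minimal version matches.
        key = tuple(sorted(table.items()))
        groups.setdefault(key, []).append(arch)
    return sorted(sorted(archs) for archs in groups.values())
```

A single resulting group means one architecture-independent file is enough;
several groups mean that at least some architectures need their own file.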
Since symbol information is integrated in the package itself, a "debdiff
--controlfiles ALL" would directly show if a package introduces new
symbols or removes existing ones.
What comes next
Up to now, I only tested those scripts on a few packages. What comes next
is some archive-wide work:
- I want to generate ready-to-use symbols files for all libraries,
initialized with packages from etch and then kept up-to-date with all
revisions uploaded to unstable.
I'll probably hook that into Mole (http://wiki.debian.org/Mole)
- I'll work with Lucas Nussbaum to do a full rebuild of the archive
with those symbols files. Then I'll try to check how many
sid packages could be installed on etch, or could directly migrate to
testing. If anyone familiar with britney would volunteer to do that for
me, that would be awesome.
This archive-wide rebuild will also let me discover a bunch of bugs that
I can then fix, so that the tool will quickly become reliable. I believe
it's trivial to integrate this work in dpkg-dev and I hope this will happen
once all those validation steps are over.
Tests, feedback, bug reports and comments welcome.