
Symbol-based dependencies on shared libraries: some news



Hello,

I'd like to give some news on my work. I've been working with the dpkg
team to prepare the integration of the new dpkg-shlibdeps, and during
that process we decided that a decentralized VCS like git would better
suit the team's needs (in particular so that Ian Jackson can more
easily maintain his Ubuntu branch and merge changes in both directions).
So I've been distracted by the work of creating that git repository,
because we integrated as much history as possible (the SVN repo had no
previous history). This is now done:
http://git.debian.org/?p=dpkg/dpkg.git;a=summary

My work is now maintained in the "dpkg-shlibdeps-buxy" branch in the
official git repository:
http://git.debian.org/?p=dpkg/dpkg.git;a=shortlog;h=dpkg-shlibdeps-buxy

We used that merge opportunity to start a clean modularization of the
Perl code, and Frank created a first test suite to avoid regressions.
I updated the manual page for dpkg-shlibdeps and wrote a new one for
dpkg-gensymbols. The dpkg-gensymbols command now scans only public library
paths instead of scanning everything.

To fetch the code, please do (with git 1.5.x):
$ git clone git://git.debian.org/git/dpkg/dpkg.git
$ cd dpkg
$ git checkout --track -b dpkg-shlibdeps-buxy origin/dpkg-shlibdeps-buxy

You can build the package and install it to play with. You can integrate
dpkg-gensymbols calls in a debian/rules file; they will typically look
like this:
dpkg-gensymbols -plibcurl3 -Pdebian/libcurl3
dpkg-gensymbols -plibcurl3-gnutls -Pdebian/libcurl3-gnutls
(you should also create a debian/<package>.symbols file; see below
for some pre-generated files)

If you want dpkg-shlibdeps to generate smaller dependencies, you have
to provide some useful symbols files in /etc/dpkg/symbols/
(some up-to-date files are here: http://users.alioth.debian.org/~hertzog/symbols.tar.bz2
they were generated by scanning etch, lenny and sid packages in succession)

The only open issue that I'd like to investigate is how best to handle
differences between architectures. Right now you have to provide one
file per architecture, and in many cases the files are very similar
except for some internal symbols.

For example, compared to i386:
* hppa always adds
  * _GLOBAL_OFFSET_TABLE_
  * __gmon_start__
* arm always adds
  * __bss_start__
  * __data_start
  * _bss_end__
  * __end__
* powerpc adds
  * _restfpr_14 (and many similar)
  * _savefpr_14 (same)
* etc.

Most C libraries only have this kind of difference between the various
architectures. glibc is a notable exception, since the history of each
port is different and thus symbol versions are not synchronized between
architectures.

C++ seems to generate many more differences (I don't know why,
possibly due to the encoding of some type information in the symbol name).
Since the symbols are also much longer, the symbols files tend to get
big quite quickly.
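
The comparison behind the per-architecture list above can be sketched
roughly as follows. This is only an illustration: it assumes a simplified
symbols file layout (an unindented header line, then indented
"<symbol> <minimal-version>" lines), and the libfoo data is made up.

```python
def parse_symbols(text):
    """Return the set of symbol names found in a simplified symbols file.

    Assumed layout (an assumption of this sketch): symbol lines are
    indented and read "<symbol> <minimal-version>"; unindented lines are
    headers and are skipped.
    """
    symbols = set()
    for line in text.splitlines():
        if not line.startswith(" "):
            continue  # header or blank line
        fields = line.split()
        if fields:
            symbols.add(fields[0])
    return symbols


def arch_only(reference_text, arch_text):
    """Symbols present on the architecture but absent from the reference."""
    return sorted(parse_symbols(arch_text) - parse_symbols(reference_text))


# Hypothetical sample data for a made-up library "libfoo".
i386 = "libfoo.so.1 libfoo1\n foo_init@Base 1.0\n foo_run@Base 1.2\n"
hppa = i386 + " _GLOBAL_OFFSET_TABLE_@Base 1.0\n __gmon_start__@Base 1.0\n"
print(arch_only(i386, hppa))
```

Running the same comparison against each architecture in turn would show
which extra symbols are purely architecture noise.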

Knowing those differences, I wonder if I should offer the possibility to have
a debian/<package>.symbols.common that would complement what can be found in
debian/<package>.symbols.<arch>, or if we need something more elaborate like
an include mechanism or a syntax to restrict a symbol to a set of architectures
(like for dependencies in Build-Depends). Please give me your opinion on this.
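
To make the first option concrete, here is a minimal sketch of how a
.symbols.common file could be combined with a .symbols.<arch> file. Nothing
here is implemented in dpkg: the merge rule (the architecture-specific entry
wins on conflict), the simplified line layout and the sample symbols are all
assumptions made for illustration.

```python
def merge_symbols(common_text, arch_text):
    """Combine a hypothetical .symbols.common with a .symbols.<arch> file.

    Assumption: indented lines read "<symbol> <minimal-version>". Entries
    from the architecture file are applied last, so they override the
    common ones when both define the same symbol.
    """
    merged = {}
    for text in (common_text, arch_text):
        for line in text.splitlines():
            fields = line.split()
            if line.startswith(" ") and len(fields) >= 2:
                merged[fields[0]] = fields[1]
    return merged


# Hypothetical sample data.
common = " foo_init@Base 1.0\n foo_run@Base 1.2\n"
arm = " __bss_start__@Base 1.0\n foo_run@Base 1.3\n"
for symbol, version in sorted(merge_symbols(common, arm).items()):
    print(symbol, version)
```

With such a scheme the per-architecture files would only need to list the
handful of symbols that differ.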

If you want to investigate the differences between architectures further,
you can get yesterday's full set of symbols files for all
architectures here:
http://users.alioth.debian.org/~hertzog/symbols-all-archs.tar.bz2
(beware, it's big: 123 MB)

You can also log in to alioth and check ~hertzog/shlibdeps/{sid,reference}/*
(reference contains the symbols files generated by scanning etch, lenny
and sid in succession, while sid contains files generated by scanning only
the latest version of the packages)

Obviously, maintainers will have to decide whether symbols files bring more
benefit than cost for their packages. Packages with few reverse
dependencies that are written in C++ and export lots of symbols are
unlikely to use symbol-based dependencies, for a simple reason:
$ ls -s libopenvrml5c2a.symbols.i386
5016 libopenvrml5c2a.symbols.i386
$ gzip libopenvrml5c2a.symbols.i386
$ ls -s libopenvrml5c2a.symbols.i386.gz 
192 libopenvrml5c2a.symbols.i386.gz

The .deb file would be 192 KB bigger, but the Installed-Size of the
package would grow by 5 MB for little gain...
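
The same trade-off can be estimated for any symbols file with a few lines
of Python. This is a rough sketch: gzip on the single file only approximates
the growth of the .deb, and the sample content (a made-up mangled symbol
repeated many times) compresses far better than a real file would.

```python
import gzip


def size_cost_kib(data):
    """Return (uncompressed, gzip-compressed) sizes of `data` in KiB.

    The first value approximates the Installed-Size growth, the second
    the growth of the .deb (an approximation, see the text above).
    """
    raw = len(data)
    packed = len(gzip.compress(data))
    return raw / 1024, packed / 1024


# Hypothetical content: one invented C++ symbol line repeated 1000 times.
sample = b" _ZN7example5Class6methodEv@Base 1.0\n" * 1000
raw_kib, packed_kib = size_cost_kib(sample)
print(f"installed: {raw_kib:.1f} KiB, compressed: {packed_kib:.1f} KiB")
```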

For reference, the biggest symbols file is libgcj7-0.symbols.ia64, at
10 MB. On i386, out of 2405 symbols files:
- 33 packages have a symbols file bigger than 1 MB
- 310 packages have a symbols file bigger than 100 KB
- 1514 packages have a symbols file smaller than 20 KB
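
Counts like these can be reproduced from a list of file sizes. The sketch
below is not the script I used; the thresholds mirror the ones above and
the sample sizes are invented.

```python
def size_buckets(sizes_in_bytes):
    """Count files above 1 MiB, above 100 KiB and below 20 KiB.

    As in the figures above, the "over 100 KiB" count also includes the
    files that are over 1 MiB.
    """
    kib = 1024
    return {
        "over_1_mib": sum(1 for s in sizes_in_bytes if s > 1024 * kib),
        "over_100_kib": sum(1 for s in sizes_in_bytes if s > 100 * kib),
        "under_20_kib": sum(1 for s in sizes_in_bytes if s < 20 * kib),
    }


# Hypothetical sizes: 5 KiB, 150 KiB and 2 MiB.
sizes = [5 * 1024, 150 * 1024, 2 * 1024 * 1024]
print(size_buckets(sizes))
```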

Cheers,
-- 
Raphaël Hertzog

First French book on Debian GNU/Linux:
http://www.ouaza.com/livre/admin-debian/


