Summary of work so far



OK, I'm going to attempt to summarise various discussions and threads that have been ongoing since LinuxWorld Expo in Olympia, between myself, Wookey, Hector and Julian Gilbey (devscripts maintainer).

(It's a very long email, sorry.)

(Julian: forgive me if you are now subscribed, I've CC'd you to be safe.)

Note: The emdebian website and parts of the Wiki have not yet been updated to reflect this summary. (Next job.)

1. emdebian-tools
=================

A set of scripts and helpers that seek to automate the emdebian scheme, including emchain and emlocale so far, plus dpkg-source-emdeb (Julian) and an as yet unwritten pair of wrapper scripts that will "emdebianise" Debian packages and handle the tweaked emdebian packaging, plus a few more that haven't been thought of yet. :-)

emdebian-tools will be an umbrella package for all the em* tools and will depend on everything that is needed to build emdebian packages (pbuilder, svn, devscripts, dpkg-dev, cdbs, . . . ).

Developers should have an intuitive build experience when crossing to and fro between Debian and Emdebian. Debian packaging is very flexible but it does have a learning curve. There is no point adding to that curve. The tools are (or should be) sufficiently powerful to take the "line of least surprise".

sudo apt-get install emdebian-tools
apt-get source foo
cd /path/to/debian/foo/foo-1.2.3
sudo empdebuild
dput emdebian ../foo_1.2.3-1em1_${ARCH}.changes

empdebuild would need a separate --basetgz that ignores the 'essential' package list from Debian and uses only glibc, busybox-full, dash, initscripts, tinylogin and dropbear/minibase, although some others may be needed. (No testing of this has taken place yet.) [1]

Then use the toolchain created (and maintained) via emchain in a chroot with the modified svn-buildpackage (svn-buildemdeb?) that implements (what will be) Emdebian Policy: the lack of /usr/share/doc/, the handling of translations, the lack of 'essential' packages, the emN version suffix and the .emdeb file suffix, and finally the creation of the multiple diffs, the expanded .dsc, a suitable .changes file and the packages themselves.

In most cases, a single Debian package is likely to result in several Emdebian packages that - combined - still occupy less space than the one original Debian package. Users will be able to install only the locale packages for their preferred language(s), instead of every possible translation, allowing a thirty-fold reduction in the size of some Debian packages on the target system. After all, why does someone who does not speak Spanish want dozens of Spanish translations installed on what is generally a personal - single-user - device? Each translation file can be 300 kB and Debian packages may have more than 70 translation files each. Additional languages can be added individually, if necessary, via a set of virtual locale packages that list each individual translation for that language. This follows the method used by OpenEmbedded.
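
As a rough illustration of the per-language split (not how emlocale is actually written), the grouping could look something like this in Python. The package naming pattern (foo-locale-es) and the function name are assumptions made up for this sketch; only the idea of one small package per language comes from the text above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_translations(package, paths):
    """Group installed translation files by language.

    Returns a dict mapping a hypothetical per-language package name
    to the list of files that package would contain.
    """
    groups = defaultdict(list)
    for path in paths:
        parts = PurePosixPath(path).parts
        # Only consider .../locale/<lang>/.../<domain>.mo
        if "locale" not in parts or not path.endswith(".mo"):
            continue
        idx = parts.index("locale")
        if idx + 1 >= len(parts) - 1:
            continue  # no language directory between "locale" and the file
        # Package names cannot contain "_" or upper case (pt_BR -> pt-br).
        lang = parts[idx + 1].lower().replace("_", "-")
        groups[f"{package}-locale-{lang}"].append(path)
    return dict(groups)
```

A virtual package for a language would then simply depend on every `*-locale-<lang>` package produced this way.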

The wrapper will support running a filtered lintian at the end of the packaging build and as a stand-alone task.

Something like `lintian -X cpy,men`, and possibly dfmt as well.

Oops: lintian can't cope with a .emdeb -
$ lintian foo_1.2.3-1em1_all.emdeb
internal error: bad package file name foo_1.2.3-1em1_all.emdeb
(neither .deb, .udeb or .dsc file)

Instead call lintian *before* the rename - until we are confident that this whole scheme works and we can ask lintian to support .emdeb, at which point we can ask that such support implies the list of excluded checks that we end up using.

(foo in the above example is a simple Hello World dash package. It is arch 'all' for no particular reason other than that it is simpler to test, at this stage, with a non-compiled package.)
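
The "run lintian before the rename" rule could be enforced in the wrapper with something like the following Python sketch. The function name is hypothetical, and the default exclusion list is just the `cpy,men` suggestion from above; the real list is still to be decided.

```python
def lintian_command(package_file, excluded=("cpy", "men")):
    """Build a filtered lintian command line for a package file.

    The excluded check abbreviations default to the ones suggested in
    the text; they are not a settled list.
    """
    if not package_file.endswith((".deb", ".udeb", ".dsc")):
        # lintian rejects unknown suffixes such as .emdeb, so it has
        # to be run before the rename step.
        raise ValueError(
            f"lintian cannot check {package_file!r}; run it before renaming"
        )
    return ["lintian", "-X", ",".join(excluded), package_file]
```

The returned list can be handed to `subprocess.run()` as-is; nothing is executed by the sketch itself.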

[1] http://wiki.debian.org/NeilWilliams

2. .emdeb
=========

Possible (likely?) suffix for emdebian binaries.

Unlike $DEBIAN_DIR, this can be used to prevent the installation of .emdeb alongside .deb on the target system. Package conflicts would be likely if the two are combined because new packages are generated from existing Debian packages.

The automated toolchain and pbuilder handling will make it easier to build emdebian-specific packages of all supported Debian packages. Hector's Emdebian BTS could be used in a similar fashion to wnpp to handle requests for additional Debian packages to be supported.

Together with the emN suffix, .emdeb achieves three distinct ends. [2]

1. Separates the emdebian archive files from other Debian archives and associated files being built from the same source tree on the developer system.
2. Identifies the archive as an adapted, stripped out, archive that is not necessarily compatible with any other .deb or .udeb files or programs.
3. Allows emdebian developers to have a version number that is independent of the full Debian version string.

On the other hand we should consider whether we really need a new suffix. The resulting files are no different from debs or udebs so far as tools that process them are concerned. The emN version may be sufficient for distinguishing purposes. There are a lot of file-management tools which would need to be told about a new extension and we already have deb, udeb, and ipkg (which are all the same in important ways).

udebs are called that because they don't conform to policy. Emdebian debs will be identical to these in output, just not in construction method. Is it too confusing to re-use the .udeb suffix? There are possible problems about udeb assumptions in d-i, but it is unclear if they actually exist in practical terms. Of course emdebian debs will conform to emdebian policy (once we've written one :-) which may differ from udeb 'policy' a bit and Debian Policy a bit more.

[2] http://wiki.debian.org/EmdebianMetaData

3. .em.diff.gz
==============

Separation of the diff between the Debian version and the emdebian version. (Think of it as an interdiff between the .deb and the .emdeb)

Scripts from emdebian-tools are expected to do the majority of the work - including updating emdebian packages when a new Debian upload is available. It may be possible to "emdebianise" some packages without having to edit files directly - although many will need some form of optimisation.

4. .emN
=======

The emdebian version suffix.

1) Debian is the upstream source of Emdebian.
2) When Debian's upstream source releases a new version, the Debian revision always resets to 1.

An emdebian numbering system for a native package would be:

foo_1.2.3em1.em.diff.gz
foo_1.2.3.tar.gz
foo_1.2.3em1_arm.changes
foo_1.2.3em1_arm.emdeb
foo_1.2.3em1.dsc

A non-native package would include:

foo_1.2.3-4.diff.gz
foo_1.2.3-4em1.em.diff.gz
foo_1.2.3.orig.tar.gz
foo_1.2.3-4em1_arm.changes
foo_1.2.3-4em1_arm.emdeb
foo_1.2.3-4em1.dsc

(because we first patch the upstream to create a debian package, then emdebianise that package with the .em.diff.gz) (dpkg-source-emdeb)
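
To make the two file sets concrete, here is a small Python sketch that generates the expected names for one upload. It assumes the usual Debian `package_version` naming with an underscore after the package name; the function and parameter names are invented for the example.

```python
def emdebian_files(package, upstream, em, arch, debian_revision=None):
    """List the files expected for one Emdebian upload.

    debian_revision is None for a native package.
    """
    if debian_revision is None:
        debian_version = upstream
        sources = [f"{package}_{upstream}.tar.gz"]
    else:
        debian_version = f"{upstream}-{debian_revision}"
        sources = [
            f"{package}_{upstream}.orig.tar.gz",
            f"{package}_{debian_version}.diff.gz",
        ]
    em_version = f"{debian_version}em{em}"
    return sources + [
        f"{package}_{em_version}.em.diff.gz",
        f"{package}_{em_version}.dsc",
        f"{package}_{em_version}_{arch}.changes",
        f"{package}_{em_version}_{arch}.emdeb",
    ]
```

Calling `emdebian_files("foo", "1.2.3", 1, "arm", debian_revision=4)` yields the six non-native names, with the plain Debian .diff.gz listed alongside the .em.diff.gz.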

That also has the benefit that an emdebian developer can get both source trees by listing both the Debian and Emdebian repositories in sources.list and running apt-get [em]source in different directories.

emN protects ordinary debian build files that are built from the same source tree because all the debian scripts are designed to ignore files that have a modified version string in the name.

As the .dsc has the second .diff.gz listed, the mismatch in the version string prevents unpacking with dpkg-source, preventing a mangled build.

Q: What happens if for some bizarre reason a package has emN at the end of its version number?

A: I guess there's nothing to stop upstream using 1.2.em1 in their version string but the emN is appended after the DEBIAN version string and the scripts can be made to check only for emN as the last three characters of the Debian version string. Debian is the upstream for emdebian, what Debian's upstream do isn't anything that should trouble emdebian.

Example: foo-1.2em4 is released at SourceForge or wherever.
Debian packages this, as a new upstream release, with the version string: 1.2em4-1. Emdebian then strips out the unwanted files and packages it as 1.2em4-1em1

The scripts can be coded to only care about emN at the end of the version string, specifically, AFTER the hyphen and the Debian version.
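
A minimal Python sketch of that check, anchoring the match at the very end of the version string. The names are hypothetical, and allowing N to run to more than one digit is my assumption rather than something decided above.

```python
import re

# emN must be the last thing in the version string. N may have more
# than one digit here, which is an assumption beyond the text.
_EM_SUFFIX = re.compile(r"^(?P<debian>.+?)em(?P<n>[0-9]+)$")


def split_emdebian_version(version):
    """Return (debian_version, N), or (version, None) if there is no emN."""
    match = _EM_SUFFIX.match(version)
    if match is None:
        return version, None
    return match.group("debian"), int(match.group("n"))
```

With the SourceForge example, `1.2em4-1em1` splits into `1.2em4-1` and 1, while the plain Debian version `1.2em4-1` is left alone. A native Debian package whose own version ends in emN would still be misread, so a real script would need an extra check for that case.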

Version sequence:
Upstream	Debian	Emdebian
1.2
		1.2-1
			1.2-1em1
			1.2-1em2
			1.2-1em3
		1.2-2
			1.2-2em1
			1.2-2em2
		1.2-3
			1.2-3em1
1.3
		1.3-1
			1.3-1em1
			1.3-1em2
		1.3-2
			1.3-2em1
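
The sequence above implies a simple rule for choosing the next emN: count up within one Debian version and restart at em1 when Debian moves on. A hedged Python sketch, with an invented function name:

```python
def next_emdebian_version(debian_version, existing):
    """Pick the next emN for a Debian version.

    existing is any iterable of Emdebian version strings already
    released; versions belonging to other Debian versions are ignored.
    """
    prefix = f"{debian_version}em"
    used = [
        int(version[len(prefix):])
        for version in existing
        if version.startswith(prefix) and version[len(prefix):].isdigit()
    ]
    # A new Debian version has no emN yet, so it starts again at em1.
    return f"{prefix}{max(used, default=0) + 1}"
```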


5. emchain automation
=====================

1. Ensure the dpkg-cross/apt-cross package cache is up to date using the timestamp already in place. Update it if not.
2. Read the cache to retrieve the highest available version number of specific toolchain packages: binutils, gcc-N.N, libcN. (Note that gcc-5.6 is already accepted, as is libc8 etc. in the package name as long as N.N and N, respectively, remain valid numerical expressions with a numeric result. Any valid Debian version is accepted, it's the package name changes that needed the code tweak.)
3. Compare those numbers with the versions available in the current directory (the toolchain build directory) - and/or possibly against the versions returned by dpkg -l. Further compare the actual Debian versions against the existing build in the same manner.
4. If all are missing (i.e. user forgot to use --create and went straight for --build), schedule a get, build and install.
5. If one is out of date, schedule a build for that one.
6. If >1 is out of date, I was going to schedule a build for each in turn in the same sequence as originally built but, yes, there are unpredictable dependency problems.

Code already exists to do stages 1, 2 and 3, in a form that could be called by cron-apt.
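
Stages 4 to 6 are only a decision on top of the data gathered in stages 1 to 3, roughly as in this Python sketch. It is not the existing emchain code: the function name and action labels are invented, and plain inequality stands in for a proper Debian version comparison.

```python
def schedule_toolchain_builds(available, built):
    """Decide what needs doing for the toolchain packages.

    available: package name -> highest version in the apt-cross cache,
               listed in the order the toolchain was originally built
    built:     package name -> version found in the build directory,
               or missing/None when nothing has been built yet
    """
    missing = [name for name in available if built.get(name) is None]
    if available and len(missing) == len(available):
        # Stage 4: nothing built at all, so get, build and install.
        return [("get-build-install", name) for name in available]
    # Stages 5 and 6: rebuild whatever is absent or differs from the
    # cache, in the original build order.
    return [
        ("build", name)
        for name in available
        if built.get(name) != available[name]
    ]
```

Keeping the original build order for stage 6 is the simplest choice, but it does nothing about the dependency problems mentioned there.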

6. debian/changelog.Debian && debian/changelog
==============================================

When "emdebianising" a Debian package and when updating an emdebian package with changes from the new Debian package upstream, move the standard debian/changelog to debian/changelog.Debian and then have debian/changelog be the emdebian changelog.

This keeps the Debian changelog untouched, while allowing all
the standard tools to work unchanged as far as the changelog is
concerned.
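
The changelog swap itself is small. A Python sketch of what the wrapper might do on first "emdebianising" a tree follows; the function name, the "unstable" distribution and the urgency value are placeholders I have assumed, not decisions.

```python
from email.utils import formatdate
from pathlib import Path


def start_emdebian_changelog(tree, package, em_version, maintainer, message):
    """Move debian/changelog aside and start a fresh Emdebian changelog."""
    debian_dir = Path(tree) / "debian"
    changelog = debian_dir / "changelog"
    preserved = debian_dir / "changelog.Debian"
    if preserved.exists():
        # Refuse to overwrite the untouched Debian changelog.
        raise FileExistsError(f"{preserved} already exists")
    changelog.rename(preserved)
    # Standard changelog entry layout; distribution and urgency are
    # placeholder values for this sketch.
    entry = (
        f"{package} ({em_version}) unstable; urgency=low\n\n"
        f"  * {message}\n\n"
        f" -- {maintainer}  {formatdate(localtime=True)}\n"
    )
    changelog.write_text(entry)
    return preserved
```

Updating for a new Debian upload would be different: the incoming Debian changelog replaces debian/changelog.Debian and a new entry is added to the top of the existing emdebian changelog, which this sketch does not cover.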

Debian		(Buildd)	Emdebian
-------		------		---------
1.2-1
				1.2-1em1
				1.2-1em2
1.2-2   	(FTBFS)
1.2-2.1 	(NMU)
				1.2-2.1em1
				1.2-2.1em2
1.3-1
				1.3-1em1

By replacing the changelog in the same wrapper that "emdebianises" the Debian source tree using emlocale etc., dpkg-buildpackage (and svn-buildpackage that uses it) should be usable directly.

Think of the wrapper as a kind of "dh_make" but which is intelligent enough to also be "dh_update" - a bit like a merge of dh_make and uupdate. (IIRC). The first task is "emdebianising a debianised tree" - equivalent to dh_make debianising the upstream tree. The second task is maintaining those changes for future versions once dpkg-source-emdeb has unpacked it. This involves checking for new translations, new docs, new packages etc. as well as failed patches.

The .em.diff.gz would be much larger than other methods but that's a repository issue, not a packaging one (and has zero effect on the target installation which is important). At least the em.diff.gz would be clear.

Then at the end of dpkg-buildpackage, the wrapper kicks in again to do:
mv *.deb *.emdeb
(although not as crudely as that! It'd work via the .changes file to determine which files to rename - mv doesn't change the md5sum so the .changes can have the suffix updated just before calling debsign.)
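
A sketch of the less crude version, in Python: walk the .changes file, swap the suffix on each .deb entry and collect the renames to perform on disk. The function name is hypothetical and the parsing is deliberately naive (any indented line whose last field ends in .deb is treated as a file entry).

```python
def rename_in_changes(changes_text):
    """Rewrite .deb file names to .emdeb inside .changes content.

    Returns the updated text and the (old, new) names to rename on
    disk. Checksums and sizes are left alone because renaming a file
    does not change them.
    """
    renames = []
    lines = []
    for raw in changes_text.splitlines():
        line = raw.rstrip()
        fields = line.split()
        # File entries are indented and end with the file name.
        if raw.startswith(" ") and fields and fields[-1].endswith(".deb"):
            old = fields[-1]
            new = old[: -len(".deb")] + ".emdeb"
            line = line[: -len(old)] + new
            if (old, new) not in renames:
                renames.append((old, new))
        lines.append(line)
    return "\n".join(lines) + "\n", renames
```

The wrapper would write the new text back, rename each pair, and only then call debsign.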

===========================================

I'm sure I've missed bits out and there are lots of issues remaining that haven't even been considered yet, but this is - I hope - a fair summary of current status.

--

Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
