
Re: Thoughts about tdebs



On Wed, 3 Dec 2008 09:12:46 +0100
Raphael Hertzog <hertzog@debian.org> wrote:

BTW, it was Eddy Petrisor, not Eric - typo. Sorry Eddy.
(Thanks, Christian.)

> On Tue, 02 Dec 2008, Neil Williams wrote:
> > > It is a very sizable special case, though, and growing thanks to
> > > the tireless translation efforts within and beyond Debian. This
> > > is why the consensus at the meeting seemed to be to get this
> > > right even if it means having to implement some special casing.
> > 
> > Raphael - I think you've missed some of the purpose behind TDebs
> > and if that is a fault of the current draft TDeb spec, then I would
> > value your comments on how to fix it.
> 
> I hear the size problem that Emdebian is facing but Tdeb is not the
> right solution for this either IMO. We should rather aim at having
> some DEB_BUILD_OPTIONS or something similar to instruct the build
> process to create the minimalist package that you need instead of
> dropping the translation entirely from the original .deb.

We already have DEB_BUILD_OPTIONS to instruct the build process to
create the minimalist package but those don't meet the need for TDebs.

It isn't "the translation"; it is dozens of translations per package - a
single package can contain 30 or 40 .mo files, between 7kB and 30kB each.
An embedded device does not need all the translations, it needs one,
two, maybe four or five at most, carefully matched against the locales
supported on that one specific device. The rest are a complete waste of
storage because storage space is not cheap. Emdebian users need control
over which translations are installed - absolute control - without
bloating the download sizes of the packages themselves. Therefore,
Emdebian Crush splits out each .mo file into a dedicated .tdeb package
and provides tools to match the package against the supported locales
and the installed package list. No other component of a Debian package
needs as much specialised code - manpages, other docs, all those can be
dumped out of the package without any other considerations. Package
names can be changed to reflect changes from --enable-foo to
--disable-foo using DEB_BUILD_OPTIONS to attain reductions in the sizes
of the actual binaries and (more importantly) the dependency chain of
those binaries. Other DEB_BUILD_OPTIONS can drop README and TODO etc.,
but translations are special.
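
A minimal sketch of the matching described above, in Python. The
"<package>-locale-<locale>" names and the select_tdebs() helper are
assumptions made for illustration only; they are not taken from the
Emdebian tools or from the TDeb specification.

```python
def select_tdebs(available, installed, supported_locales):
    """Return the per-locale translation packages worth installing.

    `available` maps a (hypothetical) tdeb name to a (package, locale)
    pair. A tdeb is kept only if its package is installed AND its locale
    is one the device actually supports.
    """
    wanted = []
    for name, (package, locale) in sorted(available.items()):
        if package in installed and locale in supported_locales:
            wanted.append(name)
    return wanted


# Illustrative data only: three translations of "foo", one of "bar".
available = {
    "foo-locale-de": ("foo", "de"),
    "foo-locale-fr": ("foo", "fr"),
    "foo-locale-ja": ("foo", "ja"),
    "bar-locale-fr": ("bar", "fr"),
}

selected = select_tdebs(available, installed={"foo"},
                        supported_locales={"fr", "ja"})
print(selected)  # ['foo-locale-fr', 'foo-locale-ja']
```

Only the translations that match both lists are selected; everything
else is never downloaded, which is the point of splitting per locale.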

The current TDeb specification also allows for Debian users to have a
similar level of control over what actually gets installed - albeit
without the ability to reduce the size of the total download. That's OK
because Debian doesn't need to care about the size of the download via
apt-get etc. - Emdebian does.

TDebs are a specialised solution to one side of the need for smaller
packages but Emdebian has other ways of reducing the size of the rest
of the package. I don't want to handle translations as "just another
component of a package", I want a solution that works as a translator.
I have a host of other methods for reducing the package size, that is
only part of the goal for tdebs.

> You can always continue to use Tdebs to distribute the translation
> that you need while installing the minimalistic .deb.

That is precisely what we already provide. Minimalised .debs (rebuilt
using cross-build support or repackaged using dpkg support) with TDebs
in a separate repository and tools that allow users to manage the
actual translations that get installed.

TDebs are a solution for translations, not for the creation of the
minimalistic .deb - however, once you have a set of TDebs for a
package, it makes no sense to retain any .mo files in the .deb.
Actually making the remainder of the .deb smaller then comes down to
use of DEB_BUILD_OPTIONS.

> > If the idea becomes separated from translations, automated creation
> > of TDebs by translation review teams would be compromised,
> > difficulties in re-assembling the source to ensure that subsequent
> > uploads retain the intervening translation updates become even more
> > complex and the drive to deliver TDebs starts to wane. (Personally,
> > I am also not at all convinced that ftp-master would consider
> > partial debs as a viable option.)
> 
> I'm sorry but the technical choice has little to do with the ability
> of translation teams to create .tdeb.

Then what is the point? TDebs are all about the translation teams
creating .tdeb - without that the whole thing is pointless.

The technical choice must rely on the needs of the translation teams.
A significant reason for adopting TDebs in Debian is to make it easier
and less intrusive for translations to be updated in a release freeze -
to allow Christian a better method than doing source NMUs on every
package using debconf when everything else is meant to be frozen. It
isn't a good idea to keep rebuilding binaries just to update the
templates file - TDebs solve that problem. It is a specialised problem
and TDebs are a specialised solution.

If the technical choice does not consider the needs of the translation
teams in deciding how to allow updates without needing source NMUs
then it is the wrong technical choice.

Generalising it to the point where the contents of the derived packages
may or may not have any relation to translation means that the
specialised translation support cannot be implemented and undermines
the entire task, to my mind, fatally.

> > A key point of agreement with ftp-master is that the TDeb can be
> > updated entirely separately from the rest of the package - this is
> > absolutely essential to all concerned. If the idea gets diluted to
> > include security updates (which almost inevitably change the
> > compiled binaries of any compiled package involved in a security
> > bug) then this update method becomes impossible and we start
> > talking about binary patches or mass-replacement of binaries by
> > partial debs. I find that idea more than a little insane. ;-) It
> > would be a sizeable regression even from where we are now with
> > binNMUs and source NMUs. This is about removing the code churn from
> > i18n updates pre-release, it is about preventing translation
> > updates during a release freeze from ever rebuilding any binaries.
> 
> Whatever you manage to do with .tdeb can be done with partial .deb… 

Except organise the internals of the file format as a collection of
locale roots and isolate translation updates from the need to rebuild
the source package.

I don't agree that anything of merit can be done with a partial .deb
that needs to be done in a .tdeb. A general solution is not the
objective. Translation updates need to be a specialised solution for a
specific need - the need to provide a method of updating translations
during a release freeze without doing source NMUs and the need to
support more explicit and more specialised translation support in
Emdebian.
 
> I propose an idea so that people can think about it and you're trying
> to argue that it can't work and that it's insane and that we
> shouldn't even think about it.

Because I did consider and test the idea of a "partial" .deb (albeit
not under that particular name) whilst developing what Emdebian is
currently using. A partial deb is not what Emdebian needs and I have
tried it. Dpkg class support, however, is a different thing entirely
and Emdebian would dearly love to see that implemented in Squeeze,
along with DEB_VENDOR support.

I also fail to see how a partial deb solves the problems of
translation updates in a release freeze.

> > However, updates to existing packages is only half the story - the
> > drive for TDebs has come from Emdebian. Emdebian must have TDebs in
> > order to meet any of the objectives for Emdebian itself. Partial
> > debs are not a solution, embedded systems need intelligently handled
> > translations and lots of smaller packages. The Emdebian solution
> > for TDebs cannot work within Debian unmodified, everyone accepts
> > this, but the Emdebian requirements do need to be achievable
> > directly from the Debian implementation.
> 
> And can you explain why "Partial debs are not a solution" ? I see no
> justification in your paragraph. Partial .deb could be used to add
> translations to any package in theory.

I did explain but you described it as "non-technical". Partial debs
do not support intelligent handling of translations because a partial
deb can contain all kinds of other stuff that has no relation to
translations. Partial debs that change binary content or package
functionality are not what we need in Emdebian or what translators
need for .tdeb and could make it a lot harder for ftp-master to accept
a translation update during a freeze.

TDeb uploads must not change anything except translations. You cannot
make that rule if you change TDebs into partial debs that could contain
who knows what. Without that rule, you cannot risk allowing partial
debs to be uploaded during a release freeze without also changing the
source of the package or rebuilding on all architectures, which is
where we are at the moment, with Christian tasked with making hundreds
of source NMUs. We need to remove the need for source NMUs to update
translations, and allowing partial debs to contain other content
undermines the distinction.
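
The "translations only" rule lends itself to a mechanical check. The
Python sketch below is one hedged illustration: the pattern list and
the non_translation_files() helper are hypothetical, and the real list
of acceptable content would have to come from the TDeb specification.

```python
from fnmatch import fnmatchcase

# Hypothetical whitelist for illustration; not taken from the TDeb spec.
ACCEPTABLE_PATTERNS = [
    "usr/share/locale/*/LC_MESSAGES/*.mo",
]


def non_translation_files(paths):
    """Return every path that matches none of the acceptable patterns."""
    return [
        path for path in paths
        if not any(fnmatchcase(path, pattern)
                   for pattern in ACCEPTABLE_PATTERNS)
    ]


# Illustrative upload: one translation file and one binary.
upload = [
    "usr/share/locale/fr/LC_MESSAGES/foo.mo",
    "usr/bin/foo",
]

rejected = non_translation_files(upload)
print(rejected)  # ['usr/bin/foo']
```

An upload with an empty rejected list touches nothing but translations;
a generic partial deb offers no comparable guarantee.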

I do not see how partial debs can deliver the specialised needs of a
tdeb (or, for that matter, a udeb that is also a partial extraction of
a .deb). I don't see the appeal of a security update being a partial
deb - to me that looks like a binary patch system à la Windows. The
only way around that is to rule out all forms of modification of the
binaries, at which point partial debs become pointless for Emdebian
Crush as well, where we need to switch off certain ./configure options
via DEB_BUILD_OPTIONS. It would also mean that only a fraction of
security updates (those related to interpreted languages) could be
handled as a partial deb.

> > The current specification supports this - partially via the premise
> > that TDeb updates only contain changes to translations and partially
> > because the individual localisations can be easily split out to
> > create the extra packages for Emdebian (as described on the
> > Emdebian website at the link above).
> 
> You could do the same with the set of partial deb dedicated to
> localization. Don't mix technical choices with policy choices.

So you want partial debs and TDebs? I see no technical reason for
partial debs at all, especially in relation to updates during a release
freeze.

Making TDebs a subset of partial debs does not allow for the internal
re-working of the file format either. Thomas did excellent work at the
Extremadura meeting on getting dpkg to understand locale roots and
preparing a file format for .tdeb that allows easy extraction of
individual locale roots from a single .tdeb.

Emdebian has two needs - ultimately split-out TDebs for Emdebian Crush
with one tdeb per source package, per locale, per architecture, and
moderately split-out TDebs for Emdebian Grip that are 99.99% automated
by using Thomas' code to split the Debian TDeb into one tdeb per source
package per locale root (where the Gripped TDeb will remain
architecture-independent).
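
As a rough illustration of the split per locale root, the Python sketch
below groups a file list by locale. It assumes the conventional gettext
layout under usr/share/locale/<locale>/; that layout and the
split_by_locale_root() helper are assumptions for this example and do
not describe Thomas' code or the actual .tdeb file format.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed location of the locale roots (conventional gettext layout).
LOCALE_BASE = PurePosixPath("usr/share/locale")


def split_by_locale_root(paths):
    """Group file paths by the locale directory they live under."""
    groups = defaultdict(list)
    for raw in paths:
        try:
            relative = PurePosixPath(raw).relative_to(LOCALE_BASE)
        except ValueError:
            continue  # not under a locale root, so not ours to split
        if len(relative.parts) < 2:
            continue  # the base directory or a bare locale directory
        groups[relative.parts[0]].append(raw)
    return dict(groups)


# Illustrative contents of a single combined TDeb.
contents = [
    "usr/share/locale/fr/LC_MESSAGES/foo.mo",
    "usr/share/locale/de/LC_MESSAGES/foo.mo",
    "usr/share/locale/fr/LC_MESSAGES/foo-extra.mo",
]

per_locale = split_by_locale_root(contents)
print(sorted(per_locale))  # ['de', 'fr']
```

Each resulting group would become one tdeb per source package per
locale root, with no need to inspect the files themselves.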

Partial .debs have no role in TDebs, AFAICT - the approach is too
generalised when a specialised solution is required.

> > There are use-cases for such things but I am not the one to take on
> > that task.
> 
> Well, when someone asks me to modify dpkg, I try to look at the bigger
> picture because that's what makes sense.

Accepted - but I don't want to lose the specialisation to the
generalised deployment.

These are the technical criteria for TDebs:
1. An internal file format that explicitly supports locale roots for
automated extraction of specific translations only.
2. A known list of acceptable content that only relates to
translations to make it possible to add translations without changing
the Debian source package.
3. The ability to build and update translation content without any
source NMUs being necessary, especially during a release freeze.
4. No translated content outside of TDebs, including in the original
package.
5. The ability to extract individual translations into bespoke packages
without changing the rest of the package.
6. A stable and predictable nomenclature for the packages to clearly
determine how the content can be updated, built and installed.
7. A clear migration path from Lenny to Squeeze and Squeeze+1.
8. One TDeb per source package in Debian with support for more flexible
implementations in Emdebian Grip and Emdebian Crush.
 
> I know that you have a lot of energy and enthusiasm for Emdebian and
> related projects but please do not use that energy to reject outright
> any suggestion that you don't like at first sight.

I did a lot of work "out-of-sight" in Emdebian whilst the current model
was being developed. I rejected a lot of possible models for a variety
of reasons and I succeeded with tdebs because the method is based
around tried-and-tested models from other embedded solutions. It may
look like an out-of-hand rejection but I have tried models along the
lines of what you propose as partial debs. I can see the idea and my
intention is not to rule things out without consideration. However, I
have been looking at this for a long time and playing with a raft of
other implementations that simply don't work.

> > Actually, from my perspective, your idea is far from simple and
> > seems to be a lot more complex than anything so far agreed with
> > ftp-master.
> 
> From a dpkg point of view, I find my idea simpler. :) We all have
> biases but we should try to step back and see further in general.

Stepping back means seeing beyond dpkg too. This is a specialised
solution that reaches beyond just dpkg. It needs to retain the
specialised support. The changes required within dpkg are actually
quite small - the changes in TDeb support to implement your "simpler"
idea are so large that it makes the entire venture unattainable.

The bigger picture is bigger than just dpkg. ;-)

The bigger picture is what you earlier discounted as non-technical -
it is a basis for smoother translation of all parts of Debian, easier,
faster updates, shorter string freezes, less work for people like
Christian and myself, easier customisation of translation support for
specialised situations like Emdebian and includes a lot more than just
a partial deb implementation in dpkg.

-- 


Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
