Re: in NEW: utf8-migration-tool -- Debian UTF-8 migration wizard
Martin-Éric Racine wrote:
On Sun, 2006-12-31 at 18:55 +0500, Alexander E. Patrakov wrote:
Martin-Éric Racine wrote:
Having merged Vincent's patch, I uploaded utf8-migration-tool to NEW.
Since Etch will be Debian's first UTF-8 release - implying a migration
from legacy encodings for those upgrading from Sarge, which is precisely
what this tool tackles - it would be nice to approve it for Etch.
1) patrakov@home:~$ utf8migrationtool
Unexpected error: exceptions.IOError
Traceback (most recent call last):
  File "/usr/bin/utf8migrationtool", line 40, in ?
    dmrc = getconfig()
  File "/usr/bin/utf8migrationtool", line 34, in getconfig
IOError: [Errno 2] No such file or directory: '/home/patrakov/.dmrc'
Works fine here, so no comment.
This is because you have the .dmrc file. I don't (I created an empty file to
get past this error when writing my first mail). This file presumably
belongs to gdm, but I don't have gdm (I use "startx"), and your package
installs fine without gdm. Missing dependency?
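
One way around the crash, as a minimal sketch only: treat a missing or empty .dmrc as "no answer" and fall back to the environment. This is not the tool's actual code. The function name read_legacy_locale is hypothetical, the [Desktop] section and Language key are an assumption about how gdm lays out .dmrc, and the sketch is written for Python 3 rather than the older Python shown in the traceback.

```python
import configparser
import os


def read_legacy_locale(dmrc_path=None, environ=None):
    """Return the user's current locale name, or None if nothing is set.

    Hypothetical helper: looks in ~/.dmrc first, then in the environment.
    """
    if environ is None:
        environ = os.environ
    if dmrc_path is None:
        dmrc_path = os.path.join(os.path.expanduser("~"), ".dmrc")

    parser = configparser.ConfigParser(interpolation=None)
    try:
        # read() skips files that do not exist instead of raising IOError.
        parser.read(dmrc_path)
    except (configparser.Error, UnicodeDecodeError):
        # A malformed .dmrc is treated the same way as a missing one.
        pass

    # Assumption: gdm stores the locale as Language= under [Desktop].
    lang = parser.get("Desktop", "Language", fallback=None)
    if lang and lang.strip():
        return lang.strip()

    # No usable .dmrc (e.g. a "startx" user): use the usual POSIX
    # precedence for the character type category.
    for variable in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(variable)
        if value:
            return value
    return None
```

With this approach a user without gdm would get the locale from $LANG (including its .KOI8-R part) instead of a traceback.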
2) The tool must handle the already-migrated case better (e.g., by adding a
line about that to the second screen).
It does. Here, it says that the locale is already migrated. It also says
that it cannot find any files utilizing a legacy encoding.
Yes, it does, but only when the old locale comes from .dmrc.
3) The legacy locale for Russia is ru_RU.KOI8-R, not ru_RU, and the
migration tool must handle this special case.
Russian is a messy case. Too many encodings, more than half of which are
OS-specific or are standards that never gained momentum. This is
further complicated by use cases: while Unices tend to go for KOI8-R,
users that need to interact with Windows use CP1251 instead. Still, it's
up to Russian developers to add support for this; upstream simply cannot
anticipate every possible exception.
OK, I temporarily take this back (the old report was based on an empty
.dmrc, although you could still take the .KOI8-R part from $LANG). However, I
replace my old report with this: when the old .dmrc contains
the migration tool migrates this to ru_RU.KOI8-R.UTF-8, which is wrong. It
also migrates de_DE@euro to de_DE@euro.UTF-8.
The locale names generally have the form:
ll_CC.CODESET@MODIFIERS (where .CODESET and @MODIFIERS may or may not be
present). The old codeset and the @euro modifier (but probably not other
modifiers) must be stripped out.
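
The rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's code: the name to_utf8_locale is hypothetical, the regular expression only covers the ll_CC.CODESET@MODIFIERS shape described here, and leaving "C" and "POSIX" untouched is my own assumption.

```python
import re

# ll[_CC][.CODESET][@MODIFIERS], as described above.
_LOCALE_RE = re.compile(
    r"^(?P<lang>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def to_utf8_locale(name):
    """Map a legacy locale name to its UTF-8 counterpart (hypothetical helper)."""
    # Assumption: the C/POSIX locales are not migrated at all.
    if name in ("C", "POSIX"):
        return name

    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError("unrecognized locale name: %r" % name)

    result = match.group("lang")
    if match.group("territory"):
        result += "_" + match.group("territory")

    # The old codeset is dropped, whatever it was, and replaced by UTF-8.
    result += ".UTF-8"

    # @euro only existed to select ISO-8859-15; other modifiers are kept.
    modifier = match.group("modifier")
    if modifier and modifier.lower() != "euro":
        result += "@" + modifier
    return result


print(to_utf8_locale("ru_RU.KOI8-R"))  # ru_RU.UTF-8
print(to_utf8_locale("de_DE@euro"))    # de_DE.UTF-8
```

Applied to the two cases reported above, this gives ru_RU.UTF-8 and de_DE.UTF-8 instead of ru_RU.KOI8-R.UTF-8 and de_DE@euro.UTF-8.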
4) Migration of encodings is only part of the game. The most important
part is to deal with packages that do not work correctly in UTF-8 locales
and cannot be fixed (e.g., a2ps). Since this part cannot be automated (as
nobody has created such a blacklist), I suggest mentioning this obstacle in
the manual page and on the welcome screen.
Remaining UCS issues really belong in Etch's release notes, since it is
Debian's first release claiming UTF-8 support.
Yes, they do. However, not everyone reads the release notes, so why not
point users to them explicitly on the welcome screen?
Thus, I cannot recommend migration of this package to Etch in its current shape.
And I still say this.
Alexander E. Patrakov