
Bug#99324: Default charset should be UTF-8



On Sat, Jun 02, 2001 at 10:53:34PM +0300, Aigars Mahinovs wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
> 
> Hello Anton,
> 
> Saturday, June 02, 2001, 2:22:20 PM, you wrote:
> > Well, I am not Russian but my impressions show that generally the
> > Russian people are not against Unicode.  It's not pleasant to deal with
> > so many incompatible 8-bit Cyrillic encodings.  When Unicode becomes a
> > real alternative to all Cyrillic encodings I guess that most of the
> > Cyrillic users (not only Russians) will switch to Unicode.
> 
> I'm Russian AND Latvian.
> 
> I second this proposal.
> 

I second it too.
However, the proposal is too vague and generic, and
good only as a long term solution.
I suggest the following roadmap:

1) Advise using utf8 in documentation. This could
   even go into debian policy for woody, though probably it will get into 
   woody+1
2) Require using utf8 in debian control files (debian/changelog, debian/control,
   Packages). This is not such a great change as it seems, since it will mean
   only replacement of a few characters in a few packages (currently using 
   iso-8859-1). Woody+1, maybe woody too.
3) Require using utf8 in documentation. Definitely should go into woody+1.
4) Fix terminfo entry for linux console so that it deals with unicode
   properly - easy, I once found such an entry on the internet but
   lost it afterwards. Woody+1.
5) Fix readline library. Not too difficult, there was a patch doing this
   for bash (see #25131). Woody+1.
6) Fix stty to accept iutf8 parameter. I do not know what it involves. Woody+1.   
7) fix passwd routines to accept 8 bit characters in GECOS. Easy, woody+1.
8) Require locales to be in utf8; if the console is in another national encoding,
   let glibc do the recoding (glibc already does this, though in a somewhat limited
   and buggy fashion). Probably will come with the evolution of glibc. Woody+1 or Woody+2.
9) boot floppies switch console into utf8 and go from there. Woody+2, maybe woody+1
   if things go fast (or woody+1 goes slow :-)). Not so difficult as it seems.
   This needs debconf templates to be in utf8, it can be done simultaneously.
10) recode other language manpages into utf8. Together with 11.
11) default console (and xterm) encoding is utf8. Woody+3, maybe woody+2 if 
    things go well.
12) by this time, linux will probably evolve to the point that it supports
    more than 512 characters in console font (framebuffer, needs some adjustment
    of console-tools). This allows painless support of CJK languages.
    After woody+3.
13) And if somebody makes linux console work with right-to-left scripts,
    add support for it too, and gain hebrew and arabic. The next generation :-).

The good point is that we can do 1)..4) easily, and it will mean an immediate
advantage over current situation, regardless of whether we dare to go on
from 5) later.
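
To illustrate how small the change in 2) is, here is a rough sketch of
recoding a control file from iso-8859-1 to utf8. It is only an illustration:
the function name is made up, and it assumes that anything which is not
already valid utf8 is iso-8859-1, which is a guess that would need checking
per package. A short iso-8859-1 byte sequence can occasionally look like
valid utf8, so the result should still be reviewed by hand.

```python
def recode_latin1_to_utf8(raw: bytes) -> bytes:
    """Return the bytes as utf8.

    Hypothetical helper: input that already decodes as utf8 is returned
    unchanged; anything else is ASSUMED to be iso-8859-1 and is recoded.
    """
    try:
        raw.decode("utf-8")
        return raw  # pure ASCII or already utf8, nothing to do
    except UnicodeDecodeError:
        # Every byte value is valid in iso-8859-1, so this cannot fail.
        return raw.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    import sys

    # Usage: python recode.py debian/changelog > debian/changelog.utf8
    with open(sys.argv[1], "rb") as f:
        sys.stdout.buffer.write(recode_latin1_to_utf8(f.read()))
```

Running it twice on the same file gives the same result, so it is safe to
apply to packages that are already converted.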

I will write a formal proposal for 1).

-- 
 -----------------------------------------------------------
| Radovan Garabik http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__    garabik @ melkor.dnp.fmph.uniba.sk     |
 -----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!


