
Re: [debian-user] Converting to UTF-8 from ISO-8859



>--[Alex Malinovich]--<demonbane@the-love-shack.net>
> On Sat, 2003-06-14 at 06:58, Rüdiger Kuhlmann wrote:
> > >--[Alex Malinovich]--<demonbane@the-love-shack.net>

> > > 1) I've set up an .Xmodmap file to map my left Windows key to Multi_key
> > > so that I can type extended characters. However, I have to run "xmodmap
> > > .Xmodmap" manually every time I restart X. I'm guessing that I should
> > > put this in an X startup script. A .bashrc equivalent for X.
> > > Unfortunately, I'm not sure what the proper file to put it in is.
> > I don't know an answer to this one, but isn't the right Windows key used by
> > it by default already?
> Not on my system. xmodmap shows the two Windows keys set to Super_L and
> Super_R.

Hmm, on X at least I can type right-Windows u " and get an ű (and not an ü
as expected *sigh*).

> > > 2) Is there a way to get UTF-8 support in a regular text console?
> > Edit /etc/console-tools/config to contain a line like "SCREEN_FONT=lat0-16"
> > IIRC. And of course have LC_ALL set correctly.
> I've done this, and set LC_ALL to en_US.UTF-8, but I still can't get
> proper UTF character support in a console.

Yes, that was the wrong font. lat0-16 is just latin-0 with height 16. That's
good enough to get the €uro, which is why I remembered it. You should be
able to select a Unicode font and a Unicode mapping with it, using
SCREEN_FONT and APP_CHARSET_MAP, and unicode_start with the right arguments
should have the same effect. I say "should" because display works for me,
but input doesn't...

> So I'm guessing that UTF-8 can use multiple bytes per character somehow?
> Just keeping the 100 or so equivalent to the ASCII characters?

Yes. ASCII characters stay single bytes with bit 8 (the top bit) cleared.
For everything else, the first byte has at least the two upper bits set; the
number of consecutive bits set from bit 8 downwards gives the number of bytes
used for this code. Each following byte has bit 8 set but bit 7 cleared,
leaving 6 bits of effective data per following byte. The lead byte also
carries a few bits of effective data, depending on the number of following
bytes. See
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8 for more details.
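
To make the bit layout concrete, here is a rough Python sketch of decoding
a single character by hand. The function names utf8_sequence_length and
decode_one are made up for this illustration, and the code deliberately
skips the finer checks a real decoder needs (overlong forms, surrogates,
the upper code point limit), so treat it as a picture of the idea rather
than something to use in earnest:

```python
def utf8_sequence_length(lead: int) -> int:
    """Number of bytes announced by a lead byte (hypothetical helper)."""
    if lead & 0x80 == 0:
        return 1  # 0xxxxxxx: plain ASCII, one byte
    # Count the consecutive set bits starting at the top bit.
    n = 0
    mask = 0x80
    while lead & mask:
        n += 1
        mask >>= 1
    if n == 1 or n > 4:
        # 10xxxxxx is a following byte, and more than four set bits
        # does not start a valid sequence.
        raise ValueError("not a valid lead byte")
    return n


def decode_one(data: bytes) -> int:
    """Decode the first character in data and return its code point."""
    n = utf8_sequence_length(data[0])
    if len(data) < n:
        raise ValueError("truncated sequence")
    if n == 1:
        return data[0]
    # The lead byte keeps 7 - n low bits of effective data.
    code_point = data[0] & (0x7F >> n)
    for byte in data[1:n]:
        # Following bytes look like 10xxxxxx: top bit set, next bit clear.
        if byte & 0xC0 != 0x80:
            raise ValueError("bad following byte")
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point


if __name__ == "__main__":
    for ch in "aü€":
        raw = ch.encode("utf-8")
        print(ch, raw.hex(), utf8_sequence_length(raw[0]), hex(decode_one(raw)))
```

For ü the two bytes are c3 bc: the lead byte contributes its low 5 bits and
the following byte its low 6, which together give 0xfc again.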

-- 
         100 DM =  51  € 13 ¢.
         100  € = 195 DM 58 pf.
  mailto:ruediger@ruediger-kuhlmann.de
    http://www.ruediger-kuhlmann.de/


