Re: [Draft] Writing i18n apps with glibc 2.2

On Sun, Oct 15, 2000 at 07:09:08PM -0700, rigel wrote:
> Hi Roger, here are some of my thoughts on your draft.


> On Sat, Oct 14, 2000 at 06:06:08PM +1100, Roger So wrote:
> > 
> > locales; it is now more POSIX compliant. [is it?]  However, as a result,
> > the semantics of many library calls have changed.  The following is a
> I'm not aware of any semantic change in any library call. Could you elaborate?

Technically there is no change at all; however, the improved locales have the
side effect that many of the i18n functions now return different values.
Maybe this is a little specific to Chinese locales, where things like 
zh_TW.Big5 have changed to zh_TW etc ...

Perhaps I should've just written "The improved locales simply break a whole
bunch of programs"...

> Good point. However I don't think it's a glibc 2.2 issue. Actually if you stick
> with glibc only this is not a very big issue, because you always know how
> the locales are named (well, kind of). It's a serious issue when you want your
> program to also run on other unix systems with their native C library. That's
> when the naming conventions become wild. So if portability is a concern,
> nl_langinfo should be used no matter what C library you work with.

One thing I forgot to mention is, one still needs to use setlocale(3) to 
set the locale before using nl_langinfo(3) ... just don't trust the string
returned by setlocale() too much.
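
For what it's worth, here is a minimal sketch of that order of calls. The
helper name current_codeset() is made up for illustration; only
setlocale(3) and nl_langinfo(3) with CODESET are real library calls. The
return value of setlocale() is checked for failure but never parsed.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: returns the charset name of the locale selected
 * by the environment (LANG, LC_ALL, ...). */
const char *current_codeset(void)
{
    /* nl_langinfo() reports on the current locale, so select it first.
     * Only test the result for NULL; do not pick the string apart. */
    if (setlocale(LC_ALL, "") == NULL)
        fprintf(stderr, "warning: locale not supported, staying in \"C\"\n");

    /* Ask the C library for the charset instead of guessing it from
     * the locale name. */
    return nl_langinfo(CODESET);
}
```

A caller would compare the returned string against the charsets it knows
how to handle, rather than looking for a ".Big5" suffix in the locale name.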

> >  2. isprint(3) vs. iswprint(3): To test whether a byte is printable, use
> >     iswprint(3).  To test whether a character is printable, use
> >     isprint(3).  Take 0xA7DA ('我' in Big5) for example: iswprint(0xA7),
> No, please don't. You should continue to use isprint to test whether a byte
> is printable.

I thought so too, but isprint(0xA7) didn't work, however iswprint(0xA7) 
worked ...?  Now I'm confused ...

> iswprint is only supposed to work with wide characters. iswprint
> was not available in glibc 2.1.x; also added to glibc 2.2 are a whole bunch of
> isw* functions and widechar I/O functions (such as wprintf). These functions
> are long awaited, just ask those mutt developers!
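
To make the byte vs. wide character distinction concrete, here is a rough
sketch of how I understand the isw* functions are meant to be used: decode
the multibyte string with mbrtowc(3) first, then hand each resulting
wchar_t to iswprint(3). The helper name count_printable_chars() is invented
for this example, and it assumes the caller has already called setlocale().

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: counts printable characters in a multibyte string
 * encoded in the current locale's charset.  Returns -1 on an invalid or
 * truncated sequence. */
long count_printable_chars(const char *s)
{
    mbstate_t state;
    size_t left = strlen(s);
    long count = 0;

    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;                 /* invalid or incomplete sequence */
        if (n == 0)
            break;                     /* NUL decoded; not expected here */
        if (iswprint((wint_t)wc))      /* test the character, not a byte */
            count++;
        s += n;
        left -= n;
    }
    return count;
}
```

With this approach a two-byte Big5 character is classified once, as a
whole, and its lead byte is never passed to isprint() or iswprint() on
its own.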

Heh, I have a lot of problems with mutt 1.2.5 and glibc 2.2 ... I meant
to check out mutt 1.3 CVS, but with exams looming ...

> Just a small thing. A new LC_CTYPE class "hanzi" was added in glibc 2.2
> locale (both zh_CN and zh_TW have it, zh_HK doesn't though).

Hmm ... that's a bug ...

> It currently
> contains all the Unicode Unihan characters. I don't know how useful this
> is (your comments on this are welcome). In case someone is interested in
> this, here is a sample implementation of "iswhanzi(wchar_t wc)"

Well, this is certainly useful for some programs; no need to have the
Unicode spec on hand to see if some character is a hanzi. :p  But as
hashao pointed out, is this portable?
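
On the portability question, one way such a function could be written
with only standard calls is via wctype(3) and iswctype(3). This is my own
sketch, not necessarily hashao's implementation. The two calls themselves
are standard, but the class name "hanzi" is specific to the locale
definition, so the sketch treats a missing class as "not a hanzi".

```c
#include <wchar.h>
#include <wctype.h>

/* Sketch: nonzero if wc belongs to the "hanzi" class of the current
 * LC_CTYPE locale, 0 otherwise (including when the locale has no such
 * class at all). */
int iswhanzi(wchar_t wc)
{
    /* wctype() returns 0 when the class name is unknown in this locale. */
    wctype_t hanzi = wctype("hanzi");

    if (hanzi == (wctype_t)0)
        return 0;
    return iswctype((wint_t)wc, hanzi);
}
```

The lookup depends on the current locale, so it is done on every call
here; a program that never switches locales could cache the wctype_t.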

Again, thank you for your comments.


  Roger So                                            telnet://e-fever.org
  spacehunt at e-fever dot org                          SysOp, e-Fever BBS
  GnuPG  1024D/98FAA0AD  F2C3 4136 8FB1 7502 0C0C 01B1 0E59 37AC 98FA A0AD
