
Re: Chinese big5 encoding and PO files



Denis Barbier:

> Err, ascii(7) tells me that 0x5C *is* a backslash.

Yes, but these documents aren't ASCII, so 0x5C may or may not be a
backslash there, depending on where it is located in the file.

> Could you please have a look at chinese/po/others.zh.po and tell me
> what to do with Subscribe/Unsubscribe translations?

Nothing should need to be done. Since the 0x5C byte is the trail byte
of the character, a proper MBCS-aware string scanner will recognize
that it is not a backslash character (unlike, for instance, in the
"please respect the ad policy" string a bit further down, which *does*
contain a backslash in the translation). Getting the string scanner to
work properly requires configuring the locales properly.
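
To illustrate the point about trail bytes, here is a minimal Python
sketch. It assumes Python's built-in "big5" codec and uses U+529F as an
example character whose Big5 encoding is 0xA5 0x5C; this character is
chosen for illustration only and is not taken from others.zh.po.

```python
# Illustration only: U+529F is an example character, not one taken from
# the PO file under discussion. Big5 encodes it as the bytes 0xA5 0x5C.
raw = "\u529f".encode("big5")

# Byte-level view: the sequence contains 0x5C, which a scanner that
# treats the data as ASCII would mistake for a backslash.
has_5c_byte = 0x5C in raw

# Character-level view: after decoding as Big5 there is no backslash.
has_backslash = "\\" in raw.decode("big5")

print(raw.hex(), has_5c_byte, has_backslash)
```

The same 0x5C byte is a backslash or not depending on whether it is read
as a byte or as part of a decoded character.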

Big5 is a bit problematic since it allows non-high-bit bytes as
trail bytes, which causes problems similar to those with ISO-2022-JP.
A stateful string scanner, one that remembers whether it is in the
middle of a character, is required to handle it properly. libc should
work fine as long as the proper locale is available, and I am pretty
sure that the gettext utilities will handle this properly.
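
As a rough sketch of what such a scanner has to do, the hypothetical
Python helper below (backslash_offsets is an invented name, not part of
gettext or libc) walks the bytes and skips each lead byte together with
its trail byte. The lead-byte range 0x81-0xFE is an assumption that
covers the common Big5 variants; the exact range differs between them.

```python
def backslash_offsets(data):
    """Return offsets of 0x5C bytes that are real backslashes.

    Hypothetical helper for illustration. Assumes Big5 lead bytes lie
    in 0x81-0xFE; the exact range depends on the Big5 variant in use.
    """
    offsets = []
    i = 0
    while i < len(data):
        byte = data[i]
        if 0x81 <= byte <= 0xFE and i + 1 < len(data):
            # Lead byte: the next byte is a trail byte, even if it is
            # 0x5C, so skip both without inspecting the trail byte.
            i += 2
            continue
        if byte == 0x5C:
            offsets.append(i)
        i += 1
    return offsets


# 0xA5 0x5C is one two-byte character; the following "\n" escape
# contains the only real backslash, at offset 2.
sample = b"\xa5\x5c" + b"\\n"
print(backslash_offsets(sample))
```

A scanner that checks every byte against 0x5C without tracking lead
bytes would also report offset 1 here.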

-- 
\\//
Peter - http://www.softwolves.pp.se/
  I do not read or respond to mail with HTML attachments.


