
Bug#235759: libc6: iconv's replacement for "German quotes in UTF-8" to latin1



On Fri, Mar 12, 2004 at 10:26:44PM +0900, GOTO Masanori wrote:
[...]
> So, default ` ' pair will become problematic under LANG=C, don't you?

Huh?  My patch only changes transliteration of U201E and U201C
characters under de locales.

[...]
> I searched about this original mapping.  ISO-5426 double open quote
> seems being assigned to U201E:
> 
> 	http://www.niso.org/international/SC4/Wg1_240.pdf
> 	http://www.issn.org:8080/English/pub/tools/format/characters

Unicode quotes are described in http://www.unicode.org/book/ch06.pdf p152.
For German it mentions (U201E, U201C) as well as French guillemets, in
which case usage is identical to the Slovenian example (this is not
illustrated but can be deduced from the explanations about German
quotes).
So it should now be pretty clear that:
  * U201E and U201C are normal German quotes (only available under
    UTF-8)
  * U00BB and U00AB are also common in German.

> W3c example which uses ` ,, '.
> 
> 	http://www.w3.org/TR/2004/CR-CSS21-20040225/generate.html
> 
> Markus Kuhn's transliteraion table uses another characters ` " '
> (maybe high position character).
> I think ,, is near character with U201E.

These tables are there to give a best-effort representation of any
Unicode text; of course better alternatives can exist depending on
local conventions, which is the case here when you know that you
are transliterating German text.
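
To illustrate the difference between a generic table and a locale-aware
one, here is a minimal sketch in Python.  It is not how glibc or iconv
implement transliteration; the function name `to_latin1` and both tables
are hypothetical, the generic replacements are only what I assume the
current output looks like, and mapping the German quotes to U00BB/U00AB
is likewise an assumption made for the example.

```python
# Hypothetical tables for illustration only; NOT glibc's locale data.

# Assumed generic fallback: the low quote turns into two commas.
GENERIC_TRANSLIT = {
    "\u201e": ",,",  # DOUBLE LOW-9 QUOTATION MARK
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
}

# Assumed German override: use guillemets, which exist in Latin-1.
DE_TRANSLIT = {
    "\u201e": "\u00bb",  # opening quote -> U00BB
    "\u201c": "\u00ab",  # closing quote -> U00AB
}


def to_latin1(text, locale_table=None):
    """Encode text as Latin-1, replacing unencodable characters.

    The locale table is consulted first, then the generic table;
    anything still unknown becomes '?'.
    """
    out = []
    for ch in text:
        try:
            ch.encode("latin-1")
            out.append(ch)  # already representable in Latin-1
        except UnicodeEncodeError:
            if locale_table and ch in locale_table:
                out.append(locale_table[ch])
            else:
                out.append(GENERIC_TRANSLIT.get(ch, "?"))
    return "".join(out).encode("latin-1")


if __name__ == "__main__":
    quoted = "\u201eHallo\u201c"
    print(to_latin1(quoted))               # generic fallback
    print(to_latin1(quoted, DE_TRANSLIT))  # German-aware fallback
```

The point is only that the locale-specific table takes precedence and
the generic one remains the fallback for everything else.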

> And is this modification gotten agreement from many German users?

Given the ugly rendering of UTF-8 quotes under ISO-8859-15 locales,
they are discussing whether French guillemets should be used even in
UTF-8 encoded PO files to work around this problem.  The quotes would
then not be perfect for UTF-8 folks, but at least ISO-8859-15 people
would not have to wonder what those commas are for.  AFAICT nobody
dares to suggest that the current transliteration does the right
thing ;)

> I wonder this proposal is not well inspected.  I would like to reject
> both bugzilla and BTS unless you provide more information.

Please reconsider your position.
There are still open questions for German people: why was this issue
not raised before?  As for French, I guess that most PO files are
ISO-8859-1 encoded, but some projects (e.g. KDE and GNOME) only accept
UTF-8 encoded PO files, so they have been hit by this bug for some time
now.  I would say that they either decided to work around it by using
French guillemets or decided not to support ISO-8859-1, but do you have
pointers to such decisions?
Also, SuSE does not seem to have fixed it either; do you know why?

Denis


