
Re: Questions regarding utf-8



Bob Hilliard <hilliard@debian.org> wrote:
>     Thanks to all who replied to my recent question on this subject.
>     Andreas Metzler <ametzler@downhill.at.eu.org> wrote:
>> With glibc I'd use
>> iconv --from=SRC-ENCODING --to=DST-ENCODING//TRANSLIT
>> if it is acceptable to change the length of strings. This will replace
>> e.g. the Euro-Symbol with "EUR".

>     Without //TRANSLIT, iconv fails if DST-ENCODING is US or ASCII,
> but with //TRANSLIT, all characters that aren't included in ASCII are
> rendered as `?'.

I was not aware of that, but you are right.

> This is useful, but not as useful as the conversions
> performed by recode.

--------------
*prompt* echo ö§ | recode latin1..ascii
"oSS
*prompt* echo ö§ | iconv -f latin1 -t ascii//TRANSLIT ; echo $?
oe?
--------------
»oe« is much better than »"o«, and »SS« is not a usable replacement
for »§« (I do not think there is one). It would be nice if iconv's
exit status reflected whether question marks were substituted, but
changing this would probably break existing software.
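
Since the exit status does not report substitutions, one possible
workaround is to compare the number of question marks before and after
the conversion. This is only a sketch: the function names
count_qmarks, has_new_qmarks and translit_checked are made up for
illustration, and it assumes a source encoding in which the byte 0x3F
always means `?' (latin1, UTF-8 and similar).

```shell
#!/bin/sh
# Count the question marks arriving on stdin (byte-wise, hence LC_ALL=C).
count_qmarks() {
    LC_ALL=C tr -cd '?' | wc -c | tr -d ' '
}

# Hypothetical helper: succeed if the converted text ($2) holds more
# question marks than the original text ($1), i.e. something was replaced.
has_new_qmarks() {
    before=$(printf '%s' "$1" | count_qmarks)
    after=$(printf '%s' "$2" | count_qmarks)
    [ "$after" -gt "$before" ]
}

# Hypothetical wrapper: convert stdin from encoding $1 to ASCII.
# Returns 0 for a clean conversion, 1 if question marks were added,
# 2 if iconv itself failed.
translit_checked() {
    input=$(cat)
    output=$(printf '%s' "$input" | iconv -f "$1" -t ascii//TRANSLIT) || return 2
    printf '%s\n' "$output"
    if has_new_qmarks "$input" "$output"; then
        return 1
    fi
    return 0
}
```

It could be called as `translit_checked latin1 < somefile`, checking
$? afterwards. A replacement character other than `?' would not be
noticed by this check.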

> Where is `//TRANSLIT' documented?

It used to be documented in the manpages, but afaict it is no longer
documented anywhere (I checked the respective manpages and the
contents of glibc-doc 2.2.5-11.2):

*prompt* zgrep -li translit `dlocate -L glibc-doc`
/usr/share/doc/glibc-doc/ChangeLog.11.gz
/usr/share/doc/glibc-doc/ChangeLog.12.gz
/usr/share/doc/glibc-doc/ChangeLog.10.gz
                cu andreas
-- 
Hey, there's a balloon vending machine in the restroom!
Unofficial _Debian-packages_ of latest unstable _tin_
http://www.logic.univie.ac.at/~ametzler/debian/tin-snapshot/


