Re: Questions regarding utf-8
On Thu, May 08, 2003 at 07:50:50PM -0400, Bob Hilliard wrote:
> Some third-party dictionaries, such as foldoc and The Jargon File
> occasionally include 8 bit characters, such as 0xe7 for c-cedilla. In
> order to fix these easily, I would like to know:
>
> 1. How can I determine what character encoding is used in a
> document without manually scanning the entire file?
Hm, I can't think of an easy way. Maybe someone else knows of tools
that can do that.
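One partial check is cheap, though: pure ASCII and valid UTF-8 can be recognized mechanically, because a stray 8 bit byte such as 0xe7 is almost never valid UTF-8 on its own. A minimal sketch in Python, assuming the file fits in memory; the function name guess_encoding is made up for illustration, and anything that fails both checks is only reported as "unknown-8bit", not identified:

```python
def guess_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8', or 'unknown-8bit'.

    This does not identify legacy encodings such as ISO-8859-1;
    it only rules ASCII and UTF-8 in or out.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes >= 0x80 that do not form valid UTF-8 sequences.
        return "unknown-8bit"


if __name__ == "__main__":
    import sys

    # Usage: python guess.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            print(path, guess_encoding(fh.read()))
```

If the result is "unknown-8bit", the actual encoding still has to be guessed from context; for English-language dictionaries ISO-8859-1 is a plausible candidate, but that is an assumption.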
> 2. What is the best available filter to convert from encoding X
> to 7 bit ASCII?
That seems to be recode. You can convert directly to UTF-8, too.
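For comparison, the same two conversions can be sketched in Python. This is not how recode works internally; it assumes the input is ISO-8859-1 (Latin-1), and the function names are made up for illustration. The ASCII variant strips accents by decomposing characters and dropping whatever is left outside ASCII:

```python
import unicodedata


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes assumed to be ISO-8859-1 as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


def latin1_to_ascii(data: bytes) -> bytes:
    """Reduce bytes assumed to be ISO-8859-1 to 7 bit ASCII.

    NFKD splits an accented letter into base letter plus combining
    mark; encoding with errors="ignore" then drops the mark.
    """
    text = data.decode("latin-1")
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", errors="ignore")
```

Note that the ASCII variant is lossy: characters without a decomposition are silently dropped rather than transliterated.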
> 3. What is the difference between utf-8 and en_US.utf8?
UTF-8 is a multibyte encoding of Unicode, whereas en_US.utf8 is a
locale: US English conventions with UTF-8 as the character encoding.
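To illustrate the "multibyte" part: in UTF-8, ASCII characters stay one byte each, while other characters take two or more. A small Python sketch (the helper name utf8_hex is made up for illustration):

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a single character as hex pairs."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    # Plain c, c-cedilla (0xe7 in ISO-8859-1), and the euro sign.
    for ch in ("c", "\u00e7", "\u20ac"):
        print(f"U+{ord(ch):04X} -> {utf8_hex(ch)}")
```

So the single byte 0xe7 from a Latin-1 file becomes the two bytes c3 a7 once the file is converted to UTF-8.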
--
Andreas Bombe <bombe@informatik.tu-muenchen.de> DSA key 0x04880A44