Re: locales and coding systems

To: leandro@dutra.fastmail.fm
Cc: debian-user@lists.debian.org
Subject: Re: locales and coding systems
From: brownh@hartford-hwp.com (Haines Brown)
Date: Sun, 4 Jan 2004 07:41:36 -0500 (EST)
Message-id: <20040104124136.0DF48BE2@teufel.hartford-hwp.com>
Reply-to: "H.Haines Brown" <brownh@hartford-hwp.com>
In-reply-to: <1073205785.2229.3741.camel@dutras.dyndns.org> (message from Leandro Guimarães Faria Corsetti Dutra on Sun, 04 Jan 2004 06:43:07 -0200)
References: <20031208213117.98A5967D@teufel.hartford-hwp.com> <20031209023923.GA12473@doorstop.net> <20031209115047.B8C607E5@teufel.hartford-hwp.com> <20031212092220.GF16285@riva.ucam.org> <20031212181917.183631CE@teufel.hartford-hwp.com> <20031212195730.GA25152@riva.ucam.org> <20031212215948.3F95B1CE@teufel.hartford-hwp.com> <20031213022746.GC452@doorstop.net> <20031213102547.94C3BA6B@teufel.hartford-hwp.com> <pan.2004.01.03.18.05.00.81535@dutra.fastmail.fm> <20040104012219.5E5CE5C3@teufel.hartford-hwp.com> <1073205785.2229.3741.camel@dutras.dyndns.org>

> On Sat, 2004-01-03 at 23:22, Haines Brown wrote:
> > I
> > do have a few files that emacs has trouble with, probably 16-bit, but
> > they are exceptional, and I know how to handle utf-16 in emacs and
> > convert those files to useful form. I've just not had the time to play
> > with the one difficult file now troubling me.
> 
> 	To convert the encoding of a file, open it and C-x RETURN f, is that
> what you're using?

Just a little context here. I'm running emacs 21.2.1. C-h C tells me
my current default coding system is utf-8; my language environment is
en_US.UTF-8. I can insert here in this message or into a blank file an
extended character, such as c-cedilla: ç.

OK, that seems to mean that emacs is working properly, and my problem
has to do instead with a problematic file. This file is a plain text
message, but it is two years old, and who knows what I may have done
to it?

In my problematic file, the extended characters appear as octal
escapes. Initially I tried to do a search/replace to convert the
octals into proper characters, but emacs would not accept the octals
as a search term. I could not search for the \347 and replace it with
a c-cedilla because the \347 I pasted into the minibuffer was not
really a \347 octal, but only looked like it. Since normally I can
paste an octal as a search term, there's something about these octals
that is not right.
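
For reference, a minimal Python sketch of what such an escape stands
for, assuming the \347 shown by emacs denotes one raw byte with octal
value 347. The variable names are illustrative only; nothing here
comes from the problematic file itself.

```python
# The escape \347 denotes a single byte with octal value 347.
raw = bytes([0o347])

print(hex(raw[0]))             # the same byte in hexadecimal: 0xe7
print(raw.decode("latin-1"))   # in ISO-8859-1, byte 0xE7 is c-cedilla

# In UTF-8 the same character takes two bytes, so a lone 0xE7 byte
# is not a valid UTF-8 sequence.
utf8 = "ç".encode("utf-8")
print(utf8)

try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)
```

If this assumption holds, a buffer read as utf-8 would show such a
byte as an escape rather than as a character, which fits the symptom
described above.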

I first assumed that the coding system of the problematic file was not
being handled by emacs properly, and I sought a way to convert the
file into useful form. I suspected maybe the file was somehow defined
for a coding system that emacs did not understand.

I tried two things. First, I tried to open the problematic file as
utf-16-le (C-x RET c utf-16-le) and then save it as utf-8 (C-x RET f
utf-8).

Now, instead of octals, the extended chars in the utf-8 file appear
as empty rectangles. So nothing gained, and perhaps information
lost. However, there was another difference, perhaps more
significant. In the original file that I suspected was utf-16-le, I
could not insert a c-cedilla, which appeared as \347. However, when I
saved the file using the utf-8 coding system, I could now insert the
c-cedilla properly.
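
A hedged Python sketch of why a wrong utf-16-le guess can lose
information, assuming (and this is only an assumption) that the file
really holds 8-bit Latin-1 text. The sample word is made up for
illustration.

```python
text = "façade"
latin1_bytes = text.encode("latin-1")    # six bytes, one per character

# Wrong guess: UTF-16-LE pairs the bytes into 16-bit units, so six
# bytes become three unrelated characters.
misread = latin1_bytes.decode("utf-16-le")
print(len(misread))
print([hex(ord(c)) for c in misread])

# Right guess: decode as Latin-1, then re-encode as UTF-8.
recovered = latin1_bytes.decode("latin-1")
utf8_bytes = recovered.encode("utf-8")
print(recovered, utf8_bytes)
```

Characters produced by the wrong pairing often have no glyph in the
current font, which is one possible source of empty rectangles.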

I did another experiment. Instead of saving the problematic file as
utf-8, I saved it as iso-latin-1. This saved file still had the octal
characters, and an inserted c-cedilla still appeared as \347. In other
words, saving the file as iso-latin-1 did nothing. Am I correct to
infer that the original document was probably latin-1 and therefore
the problem is not the document's coding system?
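
One rough way to test that inference outside emacs is to look at the
raw bytes. The function below is a hypothetical heuristic, not a
reliable detector: it cannot tell Latin-1 from other 8-bit coding
systems, and its name and return strings are invented for this sketch.

```python
def guess_encoding(data: bytes) -> str:
    """Rough guess at the coding system of some bytes (heuristic only)."""
    # A byte order mark is a strong hint of UTF-16.
    if data.startswith(b"\xff\xfe") or data.startswith(b"\xfe\xff"):
        return "utf-16"
    # NUL bytes are rare in 8-bit text but common in UTF-16.
    if b"\x00" in data:
        return "utf-16 without BOM, or binary"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # High bytes that are not valid UTF-8: some 8-bit coding system.
        return "8-bit (e.g. latin-1)"
    if all(b < 0x80 for b in data):
        return "ascii"
    return "utf-8"


print(guess_encoding(b"gar\xe7on"))                # a lone 0xE7 byte
print(guess_encoding("garçon".encode("utf-8")))
print(guess_encoding("garçon".encode("utf-16")))
print(guess_encoding(b"plain"))
```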

How does one reveal file attributes beyond what is conveyed by ls -l?
There are many more attributes than it displays. Perhaps I should also
display the file in hex-mode to see what the characters look like.
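
As a sketch of both ideas in Python: os.stat exposes the attributes
the filesystem records (the stat command prints much the same), and
a small hex dump shows the actual bytes. The helper names describe
and hexdump are invented here, and the sample file is a stand-in
created only for the demonstration.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect the main attributes recorded for a file."""
    st = os.stat(path)
    return {
        "size": st.st_size,
        "mode": stat.filemode(st.st_mode),
        "inode": st.st_ino,
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "modified": time.ctime(st.st_mtime),
    }


def hexdump(data, width=16):
    """Format bytes as offset, hex values, and printable ASCII."""
    pad = width * 3
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}} {text_part}")
    return "\n".join(lines)


# Demonstration on a throwaway file containing one Latin-1 c-cedilla.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"gar\xe7on\n")
    sample_path = tmp.name

info = describe(sample_path)
with open(sample_path, "rb") as fh:
    dump = hexdump(fh.read())
os.remove(sample_path)

print(info)
print(dump)
```

In the dump, a Latin-1 c-cedilla shows up as the single byte e7,
while a UTF-8 one would show up as the pair c3 a7.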

Haines Brown
