Re: character encoding
On Tue, Jan 01, 2008 at 05:52:07AM +0100, Vincent Lefevre wrote:
> On 2007-12-31 15:08:24 -0800, Kelly Clowers wrote:
> > On Dec 31, 2007 1:41 PM, ChadDavis <email@example.com> wrote:
> > > 3) What is the encoding of the file name? Is this a feature of the
> > > filesystem?
> > This is also based on your locale.
> And this is nasty: it means that if the user changes his locale
> (or uses different locales depending on the context), he will get
> garbled filenames; this is also the case with system scripts that run
> under the C locale. Also, different users using different locales
> won't easily be able to share files.
> Workaround 1: don't use non-ASCII characters in filenames. This
> may not be very user-friendly, but this is 100% compatible with
> Workaround 2 (if ASCII isn't sufficient): always use UTF-8. But be
> careful about the normalization problems (NFC/NFD...). Linux doesn't
> normalize filenames, so you may get several files with the same name
> (but encoded differently) in the same directory.
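
To make both problems above concrete, here is a minimal Python sketch. It
assumes the filename "café" purely as an illustration (it is not from the
thread) and only computes the byte sequences that would end up on disk; it
does not touch the filesystem. The first half shows the same name encoded
under a Latin-1 locale versus a UTF-8 locale, the second half shows the NFC
and NFD forms of the same name in UTF-8.

```python
import unicodedata

# Hypothetical example filename; U+00E9 is the precomposed "e with acute".
name = "caf\u00e9"

# 1) Same name, different locale encodings: the bytes on disk differ,
#    so a user in the other locale sees a garbled name.
latin1_bytes = name.encode("iso-8859-1")
utf8_bytes = name.encode("utf-8")

# 2) Same name, same encoding (UTF-8), different Unicode normalization.
nfc = unicodedata.normalize("NFC", name)   # precomposed: U+00E9
nfd = unicodedata.normalize("NFD", name)   # decomposed: "e" + U+0301
nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'
print(nfc_bytes)     # b'caf\xc3\xa9'
print(nfd_bytes)     # b'cafe\xcc\x81'

# The two forms look identical when displayed, but are different byte
# strings, so a filesystem that compares raw bytes treats them as two
# distinct filenames.
print(nfc_bytes == nfd_bytes)  # False
```

Normalizing names to one form (for example NFC) before creating files is one
way to avoid the duplicate-looking entries, assuming every program that
writes to the directory does the same.
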
Workaround 3: get rid of (purge) the whole locales thing and stick with
'C'. Had the advantage of at least doubling the apparent speed of my