
Re: console translator set without encoding

Today at 15:19, Marco Gerards wrote:

>> Filenames are 8-bit ASCII compatible strings (UTF-FS as in
>> "filesystem-safe" originally), and that's all you need to know to make
>> POSIX-compliant programs.

In recent discussions on austin-group@opengroup.org, someone mentioned
that only "/" (besides the terminating NUL) is forbidden in POSIX
filenames, which would mean that my claim above is not really correct:
even multibyte encodings are allowed, as long as none of their bytes
equals "/" or NUL.  ASCII is completely irrelevant.

Basically, it's all fine and dandy as long as your bytes can
accommodate at least 8 bits, and you don't use '/' (47) in there.
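
To make the rule concrete, here is a minimal sketch of a check on a
single filename component.  The function name is_valid_component is
hypothetical (it is not part of any standard API), and the sketch
assumes the usual POSIX reading that both '/' and the NUL byte are
excluded, NUL because it terminates the string.  Every other byte
value is accepted, whatever encoding it happens to belong to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: returns true when the
 * given bytes could serve as one POSIX filename component. */
bool is_valid_component(const unsigned char *name, size_t len)
{
    if (len == 0)
        return false;            /* an empty component is not a name */
    for (size_t i = 0; i < len; i++) {
        if (name[i] == '/')      /* 47: the path separator */
            return false;
        if (name[i] == '\0')     /* NUL would terminate the string */
            return false;
    }
    return true;                 /* no encoding check of any kind */
}
```

Note that the check never looks at the encoding: a UTF-8 name, an
ISO-8859-1 name and a run of arbitrary high bytes all pass equally.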

> This unfortunately fully depends on the filesystem.  Or do you mean
> the interface to the filesystem?

Indeed, but I'm actually talking about filesystem interfaces in POSIX
systems.  I know of no system that does special handling, such as
normalising, at the filesystem level (perhaps Plan9?).

All systems which have switched to using some Unicode transformation
format in the real on-disk data (Plan9fs UTF-8, NTFS UTF-16, Gnome
UTF-8) have also switched their entire internal data handling to those
formats as well.  I.e. you can supposedly run them in an ISO-8859-1
locale, but they'd still work with whatever they're using internally,
and only convert to your desired locale when needed.
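
As a rough illustration of that "convert only when needed" step, the
sketch below turns a UTF-8 name into ISO-8859-1 for display.  The
function utf8_to_latin1 is a hypothetical helper, not taken from any
of the systems mentioned; it assumes that anything outside U+0000 to
U+00FF, and any malformed byte, is simply shown as '?'.  A real system
would more likely go through iconv or its own conversion layer.

```c
#include <stddef.h>

/* Hypothetical helper: convert UTF-8 to ISO-8859-1, writing at most
 * dstcap bytes.  Unrepresentable or malformed input becomes '?'.
 * Returns the number of bytes written to dst. */
size_t utf8_to_latin1(const unsigned char *src, size_t srclen,
                      unsigned char *dst, size_t dstcap)
{
    size_t i = 0, o = 0;
    while (i < srclen && o < dstcap) {
        unsigned char b = src[i];
        if (b < 0x80) {
            dst[o++] = b;                 /* ASCII passes through */
            i++;
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < srclen
                   && (src[i + 1] & 0xC0) == 0x80) {
            /* two-byte sequence encoding U+0080..U+00FF */
            dst[o++] = (unsigned char)(((b & 0x1F) << 6)
                                       | (src[i + 1] & 0x3F));
            i += 2;
        } else {
            dst[o++] = '?';               /* not representable */
            i++;
            while (i < srclen && (src[i] & 0xC0) == 0x80)
                i++;                      /* skip continuation bytes */
        }
    }
    return o;
}
```

The internal representation stays UTF-8 throughout; only the copy
handed to the ISO-8859-1 side is lossy.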

