Bug#99933: second attempt at more comprehensive unicode policy
On Mon, 2003-01-06 at 16:15, Jochen Voss wrote:
> Hello Colin,
>
> On Fri, Jan 03, 2003 at 09:50:26PM -0500, Colin Walters wrote:
> > In summary, UTF-8 is the *only* sane character set to use for
> > filenames.
> At least I agree with this :-)
Cool.
> I think that we need filename conversion between UTF-8 and the user's
> character set, because we cannot ban all non-UTF8 terminal types. In
> my opinion the main problem is, where this conversion should take
> place.
I will say this much: I simply did not even consider doing this kind of
character set conversion as part of glibc or Linux. It just seems like
such a horrible kludge that would not actually work in practice.
Fundamentally, glibc and Linux cannot know what charset the application
itself works in. You might have stuff that undergoes UTF-8 conversion
*twice*, once by the application and once by glibc for example. It just
seems like a recipe for disaster.
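
The double-conversion failure can be illustrated with a minimal Python sketch. It assumes one layer (the application) has already encoded the name as UTF-8, and a second layer (standing in for a converting libc) wrongly assumes the bytes are still ISO-8859-15 and converts them again. The variable names are illustrative only.

```python
# Sketch: a filename that gets converted to UTF-8 twice.
name = "Bär"

# First conversion (say, by the application): text -> UTF-8 bytes.
once = name.encode("utf-8")

# Second conversion (say, by a lower layer that assumes the bytes it
# receives are ISO-8859-15 and converts them to UTF-8 again).
twice = once.decode("iso-8859-15").encode("utf-8")

print(once)                    # b'B\xc3\xa4r'
print(twice)                   # b'B\xc3\x83\xe2\x82\xacr'
print(twice.decode("utf-8"))   # BÃ€r
```

The lower layer has no way to tell from the bytes alone whether the first conversion already happened, which is the core of the objection.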
> Because a lot of programs are affected, it would gain us much if we
> could move this as deep as libc or even the kernel.
Again: I argue that we need to change all these programs *anyways*,
because you can't just use your same old C library string functions on
UTF-8. I know it seems tempting to just stick some code into glibc, but
I have serious doubts that will ever work in anything resembling a
reliable fashion.
Feel free to prove me wrong of course!
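
To make the point about byte-oriented string functions concrete, here is a small Python sketch. It uses Python's `bytes` operations as a stand-in for C functions that work one byte at a time; this is an analogy, not a claim about any specific glibc routine.

```python
# Sketch: byte-oriented operations applied to UTF-8 data.
name = "Bär"
data = name.encode("utf-8")    # b'B\xc3\xa4r'

# Counting bytes (as a strlen-style function would) gives 4, not 3.
byte_len = len(data)
char_len = len(name)

# Cutting at a byte offset can split a multi-byte sequence in half.
truncated = data[:2]           # b'B\xc3'
try:
    truncated.decode("utf-8")
    truncated_is_valid = True
except UnicodeDecodeError:
    truncated_is_valid = False

# ASCII-only case mapping leaves the non-ASCII letter untouched.
ascii_upper = data.upper().decode("utf-8")   # 'BäR'
text_upper = name.upper()                    # 'BÄR'

print(byte_len, char_len, truncated_is_valid, ascii_upper, text_upper)
```

Length, truncation, and case mapping all give different answers once the data is UTF-8, so code relying on them has to be revisited regardless of where any conversion happens.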
> Does anybody know: how do they solve the problems we discuss here?
> Where do they convert filenames, e.g. when I login via ssh and
> type "ls -l Bär*" from my LC_CTYPE=ISO-8859-15 system?
I think that it quite simply does not work.
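
A rough sketch of why the `ls -l Bär*` case fails: the pattern typed on an ISO-8859-15 terminal arrives as different bytes than the name stored in UTF-8. The filename `Bärenfell.txt` is a made-up example, and Python's `fnmatch` stands in for the shell's glob matching.

```python
import fnmatch

# Hypothetical filename, stored on disk as UTF-8 bytes.
stored = "Bärenfell.txt".encode("utf-8")      # b'B\xc3\xa4renfell.txt'

# The pattern as typed on an ISO-8859-15 terminal.
pattern = "Bär*".encode("iso-8859-15")        # b'B\xe4r*'

# Byte-for-byte matching fails: \xe4 is not \xc3\xa4.
matches_raw = fnmatch.fnmatchcase(stored, pattern)

# It only matches if something converts the pattern to UTF-8 first.
converted = pattern.decode("iso-8859-15").encode("utf-8")
matches_converted = fnmatch.fnmatchcase(stored, converted)

print(matches_raw, matches_converted)   # False True
```

Something between the terminal and the filesystem has to perform that conversion, and it needs to know both character sets to do so.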
> > Again, major chunks of upstream software which have Unicode support
> > (like GNOME) are *already* interpreting filenames as UTF-8 by
> > default.
> And how is the conversion done there?
What conversion? GNOME apps speak UTF-8 natively, and that's about all
they speak unless you set the G_BROKEN_FILENAMES environment variable.
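
The behaviour described here can be modelled in a few lines of Python. This is a simplified sketch, not GLib's implementation: `filename_to_text` is a hypothetical helper, and the assumption that setting G_BROKEN_FILENAMES makes filenames be read in the locale's character set should be checked against the GLib documentation.

```python
def filename_to_text(raw: bytes, env: dict, locale_charset: str) -> str:
    """Hypothetical model of the policy; not a GLib function."""
    if "G_BROKEN_FILENAMES" in env:
        # Assumption: fall back to the locale's character set.
        return raw.decode(locale_charset)
    # Default: filenames are taken to be UTF-8, with no conversion.
    return raw.decode("utf-8")


# A UTF-8 name works by default.
print(filename_to_text(b"B\xc3\xa4r", {}, "iso-8859-15"))

# A legacy ISO-8859-15 name only works with the variable set.
legacy_env = {"G_BROKEN_FILENAMES": "1"}
print(filename_to_text(b"B\xe4r", legacy_env, "iso-8859-15"))
```

Without the variable, the legacy name `b"B\xe4r"` is simply not valid UTF-8 and decoding it raises `UnicodeDecodeError` in this model.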