Bug#99933: Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded



On Fri, 2003-01-03 at 11:45, Radovan Garabik wrote:

> > #99933 goes a lot farther than #174982.  First of all, we can't even
> > suggest that people use UTF-8 in package control fields until all our
> > tools support it.  Right now it is just plain broken to put anything but
> > ASCII in them.
> 
> But people are putting ISO-8859-1 there, now and then.

Yes, and it is fundamentally broken to do so, because our tools do not
support it.  It might happen to display correctly on the maintainer's
machine, but it will probably fail in many other places around the world,
where people use terminals with a different native encoding.
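
To illustrate, here is a minimal Python sketch of what happens to one
ISO-8859-1 byte string on terminals with other encodings.  The name and
the render() helper are made up for the example; a real terminal does not
run this code, it just interprets the bytes in its own encoding.

```python
# Hypothetical maintainer name, stored as ISO-8859-1 bytes in a control field.
raw = "Søren".encode("iso-8859-1")  # the "ø" becomes the single byte 0xF8


def render(data: bytes, terminal_encoding: str) -> str:
    """Interpret the bytes the way a terminal using that encoding would.

    Bytes that are invalid in the target encoding become U+FFFD.
    """
    return data.decode(terminal_encoding, errors="replace")


for encoding in ("iso-8859-1", "iso-8859-2", "koi8-r", "utf-8"):
    print(f"{encoding:>10}: {render(raw, encoding)}")
```

Only the first line shows the name as the maintainer typed it.  The other
8-bit encodings map 0xF8 to a different character, and a UTF-8 terminal
sees an invalid byte.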

> And I am going to use UTF-8 for Maintainer: in my packages, once
> I have new stable mail address (and new UTF-8 GPG alias)

Please use only ASCII until the tools support it, and file bugs against
packages whose control fields contain non-ASCII characters.  Otherwise
you are just worsening the problem by adding yet another encoding to the
mix of ISO-8859-1, ISO-8859-2, and who knows what else is already there.
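
Finding such fields is easy to automate.  The following is only a rough
sketch, not an existing Debian tool: find_non_ascii() and the sample data
are invented for illustration, and the parsing assumes the usual
"Field: value" layout with continuation lines starting with whitespace.

```python
def find_non_ascii(control: bytes) -> list[tuple[int, str]]:
    """Return (line number, field name) for each line with a byte above 0x7F."""
    hits = []
    field = "?"  # name of the field the current line belongs to
    for lineno, line in enumerate(control.splitlines(), start=1):
        # A new field starts in column 0; continuation lines start with
        # a space or tab and belong to the previous field.
        if line and not line[:1].isspace() and b":" in line:
            field = line.split(b":", 1)[0].decode("ascii", errors="replace")
        if any(byte > 0x7F for byte in line):
            hits.append((lineno, field))
    return hits


# Made-up control data with ISO-8859-1 bytes in two fields.
sample = (
    b"Package: hello\n"
    b"Maintainer: S\xf8ren Example <soren@example.org>\n"
    b"Description: greeting program\n"
    b" Continuation line with caf\xe9\n"
)

print(find_non_ascii(sample))  # [(2, 'Maintainer'), (4, 'Description')]
```

Note that it works on raw bytes, so it does not need to know (or guess)
which encoding the offending bytes were meant to be in.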

> > I also personally don't like how it recommends using a "well-established
> > encoding" or UTF-8.  I mean, that's basically saying nothing.  It
> 
> well, the whole proposal was a compromise after a long and bloody
> flamewar :-)

I understand that, but I think we can just avoid the issue of general
file encodings for now, and only work on particular bits like
distributed documentation and filenames.

> > doesn't help applications at all, which will still be forced to guess
> > what encoding files are in.  In short, it doesn't improve the situation
> 
> It does help users, though. Most users are strictly monolingual
> (English does not count) and use the well-established encoding.

How does it help users?  It's basically saying "the current broken
situation is OK, but you may also unbreak your files if you want".
Putting this in policy doesn't help anyone at all.  I mean, "well
established" alone is a very vague criterion.

Let me ask this another way: what change do you expect to happen by
saying that files may be in the "well established" encoding or UTF-8?
It would basically be validating the current practice, which I consider
broken.  Policy shouldn't endorse it.

I think a better approach is just for policy to be silent on the general
encoding issue, set up a general Unicode infrastructure, start pushing
UTF-8 where it is really needed (like filenames), and let the pressure
build.
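
For filenames this is at least easy to check, because UTF-8 is
self-validating: a strict decode fails on most legacy 8-bit data.  A small
sketch follows; is_valid_utf8() is a hypothetical helper, not part of any
existing tool.

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode strictly as UTF-8."""
    try:
        name.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("Søren.txt".encode("utf-8")))       # True
print(is_valid_utf8("Søren.txt".encode("iso-8859-1")))  # False
```

Pure ASCII names pass as well, since ASCII is a subset of UTF-8.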

Do you agree?

> If you manage to persuade relevant persons (Manoj?). Good luck :-)

So far I haven't seen any objections...
