
Bug#99933: Bug#174982: [PROPOSAL]: Debian changelogs should be UTF-8 encoded



On Tue, 2003-01-07 at 11:58, Denis Barbier wrote:
> On Tue, Jan 07, 2003 at 10:23:14AM -0500, Colin Walters wrote:
> [...]
> > It looks to me like at this point almost everyone agrees with the
> > content of my proposal in #99933, and we are discussing implementation
> > details.  Agreed?
> 
> No.  We agree that UTF-8 support must be dramatically improved, but
> legacy encodings must be supported too.

Sure...but remember that my policy proposal does not drop support for
legacy charsets; in fact it recommends that programs try falling back to
them if UTF-8 decoding fails.  
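
To make the fallback idea concrete, here is a minimal sketch in Python.
It is only an illustration, not text from the proposal: the function
name decode_with_fallback and the choice of ISO-8859-1 as the legacy
charset are assumptions.

```python
def decode_with_fallback(data: bytes, legacy_encoding: str = "iso-8859-1") -> str:
    """Decode bytes as UTF-8 first; fall back to a legacy charset on failure.

    The legacy_encoding default is an assumption for illustration; a real
    program would pick it from the locale or from package metadata.
    """
    try:
        # Strict decoding: any invalid UTF-8 sequence raises UnicodeDecodeError.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Legacy fallback; undecodable bytes become U+FFFD instead of raising.
        return data.decode(legacy_encoding, errors="replace")


if __name__ == "__main__":
    print(decode_with_fallback("Müller".encode("utf-8")))
    print(decode_with_fallback("Müller".encode("iso-8859-1")))
```

Trying UTF-8 first is reasonable because non-ASCII text in a legacy
8-bit charset is rarely also valid UTF-8, so a successful strict decode
is a strong hint that the data really is UTF-8.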

I see this policy proposal as a strong statement that Debian is moving
towards Unicode, not as a means to get packages which don't grok UTF-8
removed from Debian or something silly like that.  Implicit in this is
that we will support legacy encodings to some extent for a while.

Do you agree?

> I was unclear, and only speaking about files shipped by Debian packages
> which contain non-ASCII characters without specifying their encoding.

Ok.

> Users can do whatever they want with their data.

Agreed completely.  They can have their data in any encoding they want,
as long as it's UTF-8. :)  

(just kidding...)

> I have mostly txt, man and info pages in mind.  IIRC *BSD put man pages
> under .../man/<language>.<encoding>/, don't they?  Info pages are never
> translated.  The only text files with non-ASCII letters I encounter
> are documentation and can be safely renamed, but maybe there are others.

Ah, OK.  I think that improving how our documentation formats specify
charsets is a great goal.  I misunderstood your proposal.

> > but instead we could add support to programs to autodetect the charset
> > semi-intelligently from file content, which is what programs like Emacs
> > in the real world do today.
> 
> Then why do you patch dpkg to support UTF-8 input if it can guess encoding?

Er...my patch was to support outputting UTF-8 to the user's terminal. 
There was no input involved.  I think you may have confused something
somewhere, but maybe I just wasn't clear about what it does...
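
For illustration, the output side can be sketched roughly like this in
Python.  This is not the dpkg patch itself (dpkg is written in C); the
function name encode_for_terminal is hypothetical, and taking the
terminal charset from the locale is an assumption.

```python
import locale


def encode_for_terminal(text: str, terminal_encoding: str = "") -> bytes:
    """Encode UTF-8-sourced text for display in the user's terminal charset.

    If no encoding is given, use the locale's preferred encoding
    (an assumption about how the terminal charset would be found).
    """
    if not terminal_encoding:
        terminal_encoding = locale.getpreferredencoding(False)
    # Characters the terminal charset cannot represent become '?'.
    return text.encode(terminal_encoding, errors="replace")


if __name__ == "__main__":
    entry = b"M\xc3\xbcller".decode("utf-8")  # changelog text stored as UTF-8
    print(encode_for_terminal(entry, "utf-8"))
    print(encode_for_terminal(entry, "ascii"))
```

The point is that only the final conversion to the terminal's charset
happens here; nothing about the input is guessed.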



